Abstract


Title  of  Dissertation: 

Helicobacter  pylori  Virulence  Factors  and  Their  Role  in  Pathogenesis 

Kathleen  R.  Jones,  Doctor  of  Philosophy,  2011 

Thesis  directed  by: 

D.  Scott  Merrell,  Ph.D. 

Associate  Professor,  Department  of  Microbiology  and  Immunology 

Helicobacter pylori is a Gram-negative, microaerophilic, spiral-shaped bacterium
that is the causative agent of a variety of gastric maladies: gastritis, peptic ulcers (both
duodenal and gastric), and two forms of gastric cancer. Currently, several bacterial
virulence factors have been associated with more severe gastric disease, including
specific polymorphic forms of two bacterial toxins, CagA and VacA. These toxins
have  been  shown  to  have  numerous  effects  on  host  cells,  including  modulation  of 
multiple  cellular  pathways  that  appear  to  ultimately  lead  to  disease.  Through  the  process 
of  completing  this  thesis,  we  were  the  first  group  to  show  an  association  between  East 
Asian  CagA  (EPIYA-ABD)  and  progression  to  gastric  cancer  in  a  South  Korean 
population.  We  also  examined  the  role  of  VacA  polymorphisms  within  that  population, 
and  found  that  while  the  distribution  of  vacA  alleles  was  not  associated  with  disease  state, 
it  was  associated  with  the  distribution  of  cagA  alleles  and  was  integral  in  a  three  way 
interaction with the distribution of cagA alleles and disease state. Next, we analyzed the
contribution of the newly described i region of VacA to disease development, and
identified an amino acid (196) that was important for the development of gastric cancer.
Additionally, we identified some associations that were CagA-dependent, such as the
association of VacA and disease state in the EPIYA-ABD population and the association
of the distribution of amino acids at position 231 and disease state in the non-EPIYA-ABD
population. Moreover, we were able to optimize techniques that will ultimately be
used to characterize CagA isogenic strains. Those future studies will serve to define the
role of specific EPIYA motifs in H. pylori-induced host cellular damage both in vitro
and in vivo. En masse, these data add to what we know about the complexity of H.
pylori-induced pathogenesis, and it is becoming increasingly evident that
polymorphisms within CagA and VacA, alone and in concert, affect H. pylori-induced
disease. However, the reason why only a portion of the population develops gastric
cancer remains unclear.
Chapter One


Introduction 

Helicobacter  pylori 

Helicobacter  pylori  is  a  bacterium  whose  discovery  has  radically  impacted  the 
medical field. While reports of spiral-shaped bacteria associated with the gastric mucosa
had been made multiple times since the 1870s, the presence of these bacteria was considered
a contaminant of the process (as reviewed in 126). The idea that there was a pathogenic
species  of  bacteria  that  could  live  within  the  hostile,  and  at  the  time  presumed  sterile, 
environment  of  the  stomach  was  inconceivable.  However,  H.  pylori  was  cultured  in  the 
early  1980s  by  the  accidental  incubation  of  plates  for  five  days  instead  of  the  usual  three 
days  typically  used  to  culture  Campylobacter  species  (219).  Originally,  the  organism 
was  classified  as  Campylobacter pyloridis,  but  after  subsequent  study  was  reclassified  to 
Helicobacter  pylori  in  1989  (1).  Currently,  there  are  18  identified  species  within  the 
Helicobacter genus. These species infect the gastric mucosa, intestinal tract, or
hepatobiliary tract of various mammals, which range from rodents, to domestic dogs and
cats, to farm animals (cattle and swine), to some non-human primates and humans (81,
85, 264, 330). H. pylori itself is a human and non-human primate specific organism (81,
85,  150),  which  is  believed  to  have  co-evolved  with  humans  for  at  least  the  last  50,000 
years  (23,  66,  393). 

H. pylori is a small, Gram-negative, spiral-shaped microaerobe that has multiple
(four to six) polar, sheathed flagella that are responsible for its corkscrew motility (1,
120, 127, 219). The H. pylori genome is approximately 1.6 Mb, contains approximately
1,600 open reading frames, and is A/T rich (60%; 12, 32, 124, 257, 336, 360).
Approximately  half  of  the  characterized  strains  contain  plasmids  (281).  H.  pylori  strains 
are  naturally  competent,  which  allows  for  the  constant  exchange  of  DNA.  Given  this  and 
the  fact  that  multiple  strains  can  infect  a  single  host  (110,  122,  158,  164),  it  appears  that 
H. pylori's ability to take up new DNA in vivo acts as a mechanism of evolution. Indeed,
analysis  of  an  archived  reference  strain  (J99)  and  isolates  from  the  same  patient  taken  six 
years  apart  shows  high  levels  of  genetic  diversity  (158).  Collectively,  the  thirty  new 
isolates  from  the  original  patient  infected  with  J99  had  lost  up  to  2.3%  of  the  open 
reading frames compared to the archived J99 strain (158). Furthermore, the new isolates
also contained additional DNA, not found in the original J99 strain, which showed
homology to genes from other H. pylori strains. This DNA may have been acquired
from  a  transient  co-infection  with  a  different  H.  pylori  strain  or  another  closely  related 
bacterium  (158).  Overall,  natural  competence  has  been  proposed  to  contribute  to  the  vast 
allelic  diversity  of  the  organism,  and  to  help  account  for  the  considerable  genetic 
variability  (6-7%)  between  strains  (13,  124,  205,  352). 

H.  pylori  is  extremely  well  adapted  for  colonization  of  the  human  gastric  mucosa 
thanks  to  a  variety  of  colonization/virulence  factors.  For  example,  given  that  it  is  a 
neutrophile,  one  of  the  most  daunting  innate  host  defenses  that  this  organism  has  to 
overcome  is  the  extremely  low  pH  of  the  stomach.  To  this  end,  H.  pylori  encodes  a 
urease  that  appears  to  be  the  key  factor  in  this  process  in  vivo.  Urease  hydrolyzes  urea  to 
create  ammonia,  and  the  basic  ammonia  molecule  in  turn  buffers  the  bacterial  cytoplasm, 
as well as the surrounding micro-environment (86, 87, 231). The importance of this
process is evidenced by two different studies that identified genes necessary for
colonization;  multiple  genes  within  the  urease  operon,  as  well  as  genes  that  encode 
components  required  to  obtain  and  integrate  nickel  into  the  urease  complex  were  found 
(31,  166).  Since  H.  pylori  colonizes  within  the  mucus  layer  overlaying  the  gastric 
epithelium,  the  bacteria  require  flagellar  motility  in  order  to  move  through  the  gastric 
lumen  to  the  proper  site  of  colonization  (88).  Indeed,  the  largest  number  of  characterized 
genes  found  to  be  important  for  animal  colonization  are  genes  involved  in  motility  or 
chemotaxis (31, 166). Additionally, H. pylori produces a mucinase that helps the
bacteria penetrate the mucus layer to reach the epithelium (328). While a majority of the
bacteria do live within the mucus layer, a small percentage actually adhere to the host
epithelium. Adherence occurs through the use of adhesins such as the Lewis b-binding
adhesin (BabA; 46, 155, 288) and the sialic acid-binding adhesin (SabA; 203). Other
genes  identified  as  essential  for  animal  colonization  encode  outer  membrane  proteins, 
transporters,  and  factors  important  for  cellular  maintenance  (such  as  those  that  control 
energy  production,  DNA  modification,  transcriptional  regulation,  iron  metabolism,  and 
cell  division;  31,  166). 

In  addition  to  these  colonization  factors,  H.  pylori  also  has  numerous  virulence 
factors  that  affect  the  host  cell  directly.  Two  of  those  virulence  factors,  CagA  and  VacA, 
affect  a  multitude  of  host  cellular  pathways  and  are  discussed  in  detail  later  in  this  thesis 
(29,  141-143,  215,  236,  267,  283,  308,  354,  355,  366).  In  addition  to  these  two  toxins,  H. 
pylori  encodes  factors  that  affect  inflammation.  These  include  the  N-terminus  of  the 
large  subunit  of  the  urease  protein  (204)  and  the  neutrophil  activating  protein  (NapA;  96). 
NapA was first identified as a bacterioferritin through amino acid sequence analysis (95);
however, it was later shown to be chemotactic for neutrophils and to increase neutrophil
adhesion to endothelial cells (96). Additionally, expression of the proinflammatory outer
membrane  protein  (OipA)  has  been  shown  to  be  associated  with  severe  neutrophil 
infiltration,  gastric  inflammation,  and  high  colonization  loads  (395,  396).  OipA  seems  to 
co-vary  with  other  virulence  factors,  including  cagA  and  vacA  (395,  396).  Other 
virulence  factors  that  contribute  to  inflammation  are  the  superoxide  dismutase  (SodB; 
317) and the type IV secretion system encoded on the cag pathogenicity island (71). En
masse, all of these virulence factors, or particular combinations of these virulence factors,
allow  H.  pylori  to  be  an  effective  pathogen  despite  its  small  genome  and  chosen  niche. 

Epidemiology  and  Disease 

Worldwide,  H.  pylori  infection  is  acquired  by  more  than  50%  of  children  by  the 
age  of  10  (285).  Conversely,  the  acquisition  of  new  infections  in  adults  is  significantly 
lower  (174).  Overall  prevalence  is  approximately  50%  of  the  world’s  population,  though 
infection  rates  vary  from  7-87%  (3,  84,  104,  210,  225,  273,  345,  358,  403).  The  highest 
rates  of  infection  are  among  developing  or  more  recently  developed  countries,  as  well  as 
countries  in  East  Asia,  such  as  South  Korea  (3,  225,  273).  Interestingly,  not  only  do 
prevalence  rates  vary  between  countries,  but  also  within  a  country  (182,  269).  There 
appears  to  be  an  inverse  relationship  between  socioeconomic  status  and  infection  with  H. 
pylori  (129,  208,  210-212,  326).  Furthermore,  in  the  U.S.  race  was  also  found  to 
influence  infection  rates.  Minorities,  especially  African  Americans,  are  more  likely  to  be 
infected  than  Caucasians  and  are  more  likely  to  maintain  the  infection  (101,  129,  209). 
While  it  is  generally  believed  that  H.  pylori  infection  lasts  the  lifetime  of  the  host  without 
treatment, spontaneous clearance and transient infections have been reported (22, 130,
176,  209,  298).  Even  in  treated  individuals,  recurrence  of  infection,  particularly  with  new 
strains,  is  frequent  in  countries  where  overall  H.  pylori  prevalence  is  high  (123,  253). 
Reinfection  is  likely  due  to  the  fact  that  infection  fails  to  induce  a  protective  immunity. 
Recently, however, the global infection rate for H. pylori has been slowly declining (33).

Transmission  of  H.  pylori  appears  to  occur  via  person  to  person  spread,  and  many 
studies  have  correlated  higher  rates  of  H.  pylori  infection  to  living  in  close  quarters:  for 
instance,  an  institutionalized  setting  (44,  135,  171,  182,  190).  In  fact,  infection  rates  were 
increased  by  as  much  as  52%  in  an  institutionalized  population  as  compared  to  gender 
and  age  matched  controls  (182).  In  support  of  increased  person  to  person  contact  being  a 
risk factor for infection, another study found increased rates of H. pylori infection in
children  living  in  communal  apartments  as  compared  to  children  with  traditional  families 
(212).  Transmission  is  also  thought  to  be  familial;  children  born  to  an  infected  parent  are 
more  likely  to  be  colonized  and  the  colonizing  strain  is  the  same  strain  carried  by  the 
infected  parent  (173,  206,  300,  357).  Other  studies  have  looked  at  this  trend  from  the 
other  side  and  found  that  in  families  with  H.  pylori  infected  children,  there  were 
increased  numbers  of  family  members  infected  as  compared  to  families  with  uninfected 
children  (82,  228).  Transmission  has  also  been  documented  between  spouses,  and  the 
risk of infection increased the longer the spouse lived with their infected partner (51,
325). These studies suggest that H. pylori is in fact spread person to person.

H.  pylori  infection  is  specific  to  humans  and  non-human  primates,  and  although 
we  do  not  know  the  exact  vehicle  of  human  transmission,  it  is  believed  to  be  spread 
through  either  gastro-oral,  oral-oral,  or  fecal-oral  routes  of  infection.  Although  rare,  there 
is some evidence that H. pylori can survive in a culturable form in milk and tap water for
a  few  days  (98,  153),  and  longer  in  a  nonculturable  coccoid  form  (373).  Gastro-oral 
transmission  of  H.  pylori  has  been  postulated  and  is  a  risk  factor  among  institutionalized 
adults  in  the  Netherlands  (44).  In  fact,  H.  pylori  was  detected  in  the  vomitus  of  100%  of 
adult  subjects  and  could  be  detected  and  cultured  approximately  40%  of  the  time  from  up 
to  0.3  meters  away  (277).  Vomiting  was  also  shown  to  be  a  risk  factor  for  infection  in 
children  (201,  282).  Another  possible  method  of  transmission  is  oral-oral  spread.  This 
method  has  been  proposed  due  to  studies  in  which  identical  H.  pylori  strains  were 
isolated  from  the  stomach  and  dental  plaque  (56,  99,  318).  However,  the  rate  of  isolation 
of  H.  pylori  from  dental  plaque  is  extremely  low  (39,  54,  56,  99,  318),  and  PCR  detection 
rates  in  dental  plaque  vary  (224).  Interestingly,  dental  workers  do  not  have  an  increased 
rate  of  infection  (194,  207,  249).  Another  possible  means  of  transmission  is  the  fecal- 
oral  route.  Studies  have  been  able  to  identify  H.  pylori  DNA  in  feces,  but  the  rates  of 
detection vary from less than 10% (245) up to 90% (191, 216). While there have been
studies  that  successfully  cultured  the  organism  from  feces  (168,  356),  these  studies 
looked at a population that was malnourished, where the bacteria would have experienced
a shorter transit time in the intestine. In fact, another study verified that, by inducing a
shorter time in the intestine via use of a cathartic, the ability to culture H. pylori from
feces  was  increased  (277).  However,  controversy  still  surrounds  this  method  of 
transmission because H. pylori is sensitive to bile (134). H. pylori was also found not to
be associated with other diseases that are transmitted fecal-orally, such as hepatitis A
(113, 140, 200, 303, 380). Thus, overall, scientists still do not know how H. pylori is
transmitted. Answering this question would go a long way toward understanding H. pylori
epidemiology  and  induced  disease  etiology. 

H. pylori was first identified within areas of inflammation and was suggested to
be responsible for the inflamed tissue (219). This correlation was corroborated by
Marshall’s famous experiment, in which, upon consumption of an H. pylori culture, he
developed  gastritis  (217).  As  mentioned  above,  the  discovery  of  H.  pylori  was  a 
controversial  one,  because  people  believed  that  bacteria  could  not  survive  the  harsh 
environment  of  the  stomach,  let  alone  be  the  causative  agent  of  gastric  disease  (218). 
It is now readily accepted that H. pylori is the etiologic agent of acute and
chronic gastritis, peptic ulcer disease (75% of gastric and 90% of duodenal ulcers), and
two forms of gastric cancer (MALT lymphoma and gastric adenocarcinoma; 40, 94, 275,
276,  349).  Disease  progression  can  include  asymptomatic  infection  or  can  advance  from 
acute  to  chronic  gastritis  and  then  to  more  severe  forms  of  disease  (335).  If  the 
inflammation  is  predominant  in  the  antral  end  of  the  stomach,  then  gastritis  may  progress 
to duodenal ulcers, whereas nonatrophic pangastritis can progress to MALT lymphoma,
and  corpus  atrophic  gastritis  can  lead  to  gastric  ulcers,  and  eventually  cancer  via  the 
following  stages:  intestinal  metaplasia,  dysplasia,  and  gastric  adenocarcinoma  (335). 
Gastric cancer is still the second leading cause of cancer-related deaths
worldwide  (74,  248,  272,  392).  For  all  of  these  reasons,  the  World  Health  Organization 
has  classified  H.  pylori  as  a  Class  I  carcinogen,  and  it  is  currently  the  only  bacterium  with 
this  designation  (4). 

While  there  is  not  a  gender  bias  among  H.  pylori  infection  rates,  there  is  a  bias 
among  disease  distributions.  Females  are  more  likely  to  suffer  from  gastritis,  while 
males are more likely to be afflicted with more severe disease: both ulcers and cancer
(302). In fact, men are 1.5 to 2.5 times more likely than women to suffer from gastric
cancer (reviewed in 297). While overall gastric cancer rates vary drastically worldwide,
it may not be surprising that gastric cancer rates are highest where H. pylori colonization
rates are highest (74). For instance, South Korea, which has one of the highest rates of
colonization, also has one of the highest rates of gastric cancer (42, 60, 133, 210, 323,
358, 403). In fact, 56.2% of the infection-related cancers in South Korea are due to H.
pylori, and 45.1% of the cancer deaths due to infectious diseases are associated with H.
pylori infection (322). This striking statistic exemplifies the burden of H. pylori-induced
disease.

While many insights into H. pylori-induced disease have been gained, the reason
why some individuals develop more severe disease remains elusive. The epidemiologic
triangle looks at disease as a contribution of three attributes: agent (in this case H. pylori),
environment, and host. The first arm of the triangle is the agent, specifically bacterial
factors that affect disease. Recent studies have focused on the role of the H. pylori
toxins, CagA and VacA, in disease development. Both of these toxins are polymorphic
within different H. pylori strains, and particular polymorphisms have been correlated with
the  development  of  particular  disease  states.  The  more  virulent  forms  of  these  factors 
also  correlate  with  the  geographical  areas  with  the  highest  rates  of  the  most  severe  forms 
of  disease,  specifically  gastric  cancer.  In  fact,  over  90%  of  South  Korean  isolates 
contain  the  most  virulent  form  of  CagA,  which  has  been  shown  to  impact  gastric  cancer 
risk  (42,  60).  Additionally,  OipA  expression  was  proposed  to  be  a  factor  that  affects 
disease  progression  (395).  Although  its  role  as  a  novel  marker  for  certain  disease  types  is 
uncertain (17, 18, 125, 196), the duodenal ulcer-promoting gene (dupA) is associated with
increased inflammation, but has an inverse relationship with gastric atrophy and cancer
(17, 196, 287). Furthermore, the Helicobacter outer membrane protein B (HomB)
impacts disease development (71, 163, 261). homB presence is associated with peptic
ulcers and is a discriminative factor between gastric cancer and duodenal ulcers (163,
260, 261). The second arm of the epidemiological triangle focuses on the environment in
which  afflicted  people  live.  We  have  already  discussed  that  low  socio-economic  status, 
as  well  as  proximity  to  other  people,  impact  infection  rates.  However,  other 
environmental  factors  play  a  role  in  disease  development.  These  include  co-infection 
with  parasites  and  cigarette  smoking.  It  has  been  shown  that  infection  with  helminths  can 
actually  diminish  disease  severity  (83,  106,  382),  whereas  cigarette  smoking  increases 
gastric cancer rates (321, 324, 332). The final arm of the epidemiological triangle is the
host. Host factors include both host genetics and diet. Mutations in different host genes
can increase the likelihood of developing gastric cancer, including IL-1β (91, 92, 112, 202)
and PTPN11, the gene that encodes SHP-2 (37, 149). Also included in the host arm
are dietary factors, which impact disease development by affecting bacterial factors or
changing the host environment. In fact, salt intake is implicated as important for H.
pylori-induced gastric cancer development (107, 146, 364). Conversely, consumption of
fruits and vegetables, which contain vitamin C, β-carotene, and antioxidants, can actually
decrease  the  risk  of  gastric  cancer  (89,  93,  331,  387).  As  scientists  and  doctors  are  well 
aware,  the  process  of  disease  development  is  complicated  and  involves  not  only  the 
bacteria,  but  also  environmental  and  host  factors. 
Treatment and Vaccines¹

During the relatively short period of time that we have known about H. pylori,
many different treatment regimens have been developed (reviewed in 55). In fact,
in 1994 there was a consensus from the National Institutes of Health (USA; 5), followed
two years later by the Maastricht Consensus from the European Helicobacter Study
Group (Netherlands; 2), which established recommendations for treating H. pylori
infection. Given the increase in incidence of antibiotic resistance, the Maastricht
Consensus report was updated in 2000 and again in 2005 to increase the effectiveness of
treatment regimens against H. pylori (213, 214).

The current recommendation for first line therapy in locations where
clarithromycin resistance is low is a proton pump inhibitor, clarithromycin, and either
metronidazole (first choice) or amoxicillin (second choice) for 14 days (213).
Additionally,  these  triple  therapy  regimens  can  be  supplemented  by  the  addition  of 
bismuth  in  geographical  areas  where  antibiotic  resistance  is  high,  though  this 
combination  is  typically  recommended  as  a  second  line  therapy  (213).  Moreover,  since 
bismuth  is  not  available  in  many  countries,  a  combination  of  a  proton  pump  inhibitor, 
metronidazole,  and  either  amoxicillin  or  tetracycline  is  sometimes  recommended  (213). 

Primary and secondary therapies are not always successful at eradicating H.
pylori; therefore, many alternative drugs are proposed for rescue therapy. These include
fluoroquinolones (such as levofloxacin), rifamycins (such as rifabutin and rifampicin),
nitrofurans (such as furazolidone) and other members (such as doxycycline) within
families that are already used to treat H. pylori infection (reviewed in 55, 64). Of note,
resistance has been found to all utilized primary and secondary antibiotics, as well as to
many of the antimicrobials used for rescue therapy (Fig. 1). This fact suggests that
therapy success rates will continue to decline, and indicates that a detailed understanding
of antibiotic resistance mechanisms may facilitate development of novel therapeutics.

¹ Excerpts taken from the review article: K.R. Jones, J.H. Cha, and D.S. Merrell. 2008. Who’s
Winning the War? Molecular Mechanisms of Antibiotic Resistance in Helicobacter pylori. Current Drug
Therapies. 3: 190-203.

Arguably the best option would be the production of an H. pylori vaccine, and
several possible vaccine candidates are being researched (238, 327, 407). Vaccine
components vary and include killed H. pylori whole cell extracts (289), heat shock
proteins (100), flagellar antigens (327), adhesion antigens (340), lipopolysaccharide
antigens (105), neutrophil activating protein (304), and urease (35). Unfortunately, many
of  these  vaccines  are  a  long  way  from  human  trials  (238,  327,  407),  and  the  inactivated 
whole  cell  extract  was  proven  ineffective  in  a  human  volunteer  study  (177). 

Virulence Factors that Impact Host Cell Pathways²

Due to H. pylori's association with a variety of severe gastric diseases, many
studies have been conducted to elucidate the bacterial, host, and environmental factors
that impact disease progression. To date, several bacterial virulence factors have been
associated with gastric cancer. For instance, the outer membrane proteins HomB (163,
260) and BabA, the latter in Western type CagA containing strains (121) but not in East
Asian type CagA containing strains (230), have been associated with progression to gastric
cancer, while OipA (36, 254) and DupA (80, 154, 196, 251, 306) have more controversial roles
in disease development. The effectors, CagA and VacA, have also been shown to
influence disease state, and are probably the most well studied virulence factors of H.
pylori. These toxins have been shown to have multiple effects on host cells, as well as to
modulate multiple cellular pathways in what appears to be a complex orchestration that
ultimately leads to disease. To begin to shed some light on these pathways, as well as on
the etiology of disease, this section will highlight some major findings regarding CagA,
VacA, and their specific effects on host cells. Due to the large amount of literature on
this subject and space limitations, an exhaustive review is not provided. However, we
encourage readers to explore the excellent reviews by Cover and Blanke (67), Rieder et
al. (294), and Hatakeyama and Higashi (139).

² Excerpts taken from the review article: K.R. Jones, Jeannette Whitmire, D.S. Merrell. A tale of
two toxins: Helicobacter pylori CagA and VacA modulate host pathways that impact disease. Frontiers in
Cellular and Infection Microbiology, doi: 10.3389/fmicb.2010.00115

Figure 1: Cellular components targeted by commonly used antibiotics and mechanisms
of resistance utilized by H. pylori. The upper portion of the figure depicts normal drug
interactions, while the lower portion depicts resistance mechanisms currently identified in
H. pylori. * Denotes a mutation in the specified gene. A. β-lactams prevent the
completion of the peptidoglycan layer of H. pylori through their interaction with
penicillin binding proteins (PBP). Resistant bacteria contain either a mutated PBP, which
prevents interaction with the β-lactams, or mutations in hopB or hopC, which decrease
accumulation of the β-lactam within the bacterial cell. B. Nitrofurans and
nitroimidazoles act in a similar manner against bacterial DNA. Both pro-drugs enter the
cell and must be reduced by nitroreductases to become active. The activated form leads
to formation of radicals that damage DNA. In resistant bacteria one or more of these
nitroreductases are inactivated. The existence of a TolC efflux pump has also been
identified as a mechanism of resistance to nitroimidazoles. C. Fluoroquinolones act upon
gyrases, which are enzymes responsible for the conversion of DNA into a relaxed state
required for DNA replication. Bacteria containing mutations in these gyrases are
resistant to fluoroquinolones. D. Rifamycins act by blocking a subunit of the DNA-dependent
RNA polymerase, thereby terminating the production of mRNA. Resistant
bacteria contain mutations within this subunit, which is encoded by the rpoB gene. E.
Tetracyclines and macrolides both prevent the completion of translation, thereby
preventing protein production. Mutations within specific ribosomal subunits cause H.
pylori to be resistant to these drugs. Also, the HefABC efflux pump has been shown to
remove tetracycline and some macrolides from the bacterial cell.

Cytotoxin-associated  gene  A  -  CagA 

CagA is arguably the most well studied virulence factor of H. pylori. It is
encoded on the cag pathogenicity island, a horizontally acquired 40 kb DNA
segment that encodes a type IV secretion system, and is the only known effector
protein to be injected into host cells (9, 57). cagA is the last gene on the cag
pathogenicity island, and encodes the 120-145 kDa immunodominant CagA protein
(65, 367). Since its discovery, CagA has been shown to impact disease, especially more
severe disease states like gastric cancer (42, 152, 178, 274, 388). cagA is present in
~70% of strains worldwide, but this rate varies geographically from 90-95% in
East Asian countries (South Korea, China, Japan) to only about 40% in Western countries
(Australia, United States of America, England; 137).
Once injected into host cells, CagA can act directly in an unphosphorylated state
to influence cellular tight junctions (16, 30, 240, 262), cellular polarity (301, 406), cell
proliferation  and  differentiation  (58,  184,  227,  240),  cell  scattering  (227),  induction  of  the 
inflammatory  response  (49),  and  perhaps  cellular  elongation  (Fig.  2;  301,  370). 

Moreover,  upon  entering  the  eukaryotic  cell,  CagA  localizes  to  the  plasma  membrane 
where  it  can  be  phosphorylated  by  either  Abl  kinase  or  Src  family  kinases  (284,  311,  333, 
334,  350).  These  kinases  phosphorylate  tyrosine  residues  found  in  a  five  amino  acid 
repeat, Glu-Pro-Ile-Tyr-Ala (EPIYA), within the carboxy-terminus of CagA (142, 143).
These  repeats  can  be  categorized  based  on  the  amino  acid  sequences  found  within  the 
regions  flanking  the  EPIYA  sequence  to  yield  four  distinct  EPIYA  motifs,  which  are 
known  as  EPIYA-A,  -B,  -C,  and  -D.  Two  combinations  of  these  motifs  predominate: 
Western  CagA,  which  contains  EPIYA-A,  -B,  and  -C  motifs  (strains  have  been 
genotyped  that  contain  up  to  five  -C  motifs)  and  East  Asian  CagA,  which  contains 
EPIYA-A,  -B,  and  -D  motifs  (Fig.  2;  19,  65,  142,  143,  242,  250,  333).  Additionally, 
there  is  a  multimerization  motif  that  consists  of  a  16  amino  acid  sequence  present  within 
the  EPIYA  repeat  region  (291).  Once  phosphorylated,  CagA  can  form  a  complex  with 
the  CT10  regulator  of  kinase  (Crk)  adaptor  protein  (50,  344),  Abl  kinase  and  a  splice 
variant of Crk, CrkII (350), or the Src homology 2 phosphatase (SHP-2; 143). Each of
these  interactions  influences  cellular  shape  and  motility  (50,  143,  344,  350).  CagA  that  is 
phosphorylated  at  the  primary  phosphorylation  sites  (EPIYA-C  and  -D)  shows  varying 
affinities  for  SHP-2  based  on  the  particular  EPIYA  variant  as  well  as  subsequent 
differential  effects  on  the  pathways  influenced  by  the  phosphorylated  CagA/SHP-2 
complex  (142,  143). 
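
The motif nomenclature described above can be illustrated with a minimal sketch. It assumes that each strain has already been genotyped and summarized as a string of motif letters (for example "ABC", "ABCCC", or "ABD"); the function name, the input format, and the "unclassified" fallback are hypothetical choices made for illustration and are not taken from the cited studies. The sketch does not identify motifs from flanking sequences; it only applies the Western (EPIYA-C containing) versus East Asian (EPIYA-D containing) distinction and counts the -C motifs.

```python
from collections import Counter


def classify_caga(motifs: str) -> dict:
    """Classify a CagA EPIYA motif string such as "ABC", "ABCCC", or "EPIYA-ABD".

    Hypothetical helper: the input format is an assumption made for this sketch.
    """
    # Normalize input: drop the optional "EPIYA-" prefix and any hyphens.
    pattern = motifs.upper().replace("-", "").replace("EPIYA", "")
    if not pattern or set(pattern) - set("ABCD"):
        raise ValueError(f"unrecognized EPIYA motif string: {motifs!r}")

    counts = Counter(pattern)
    if counts["D"] and not counts["C"]:
        caga_type = "East Asian"    # contains EPIYA-D, no EPIYA-C
    elif counts["C"] and not counts["D"]:
        caga_type = "Western"       # contains one or more EPIYA-C motifs
    else:
        caga_type = "unclassified"  # neither -C nor -D, or both present

    return {
        "pattern": pattern,
        "type": caga_type,
        "c_motifs": counts["C"],
        "d_motifs": counts["D"],
    }
```

For example, `classify_caga("EPIYA-ABD")` reports the East Asian type, while `classify_caga("ABCCC")` reports the Western type with three -C motifs.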

Figure 2: CagA and known host cell targets. A. A schematic representation of CagA
with the polymorphic region containing different EPIYA motif (A, B, C, and D)
combinations is shown and is adapted from Hatakeyama and Higashi (139). B. A
graphic depiction of the gastric mucosa and known host pathways impacted by
phosphorylated and nonphosphorylated CagA is shown. Pathways targeted in epithelial
cells and B cells are indicated. The actin binding proteins (ABP) affected by CagA
include vinculin, cortactin, and ezrin. This figure was adapted from an earlier version by
Rieder, et al. (294).

CagA and Disease

The  mere  presence  of  CagA  is  associated  with  more  severe  disease  forms  (42,  65, 
72,  133).  In  fact,  cancer  patients  are  at  least  twice  as  likely  to  be  infected  with  an  H. 
pylori  strain  that  is  cagA  positive  than  one  that  is  cagA  negative  (42,  133).  It  has 
additionally  been  demonstrated  in  vivo  that  cagA  plays  an  important  role  in  disease 
progression  in  a  Mongolian  gerbil  model  where  gastric  cancer  develops  within  12  weeks 
(108, 109, 255, 280, 295, 379, 386). The differences in affinity of the various EPIYA motifs
for SHP-2, and the subsequent differences in induction levels of downstream pathways, have
been speculated to contribute to the differences in disease rates, especially gastric cancer (138,
142,  143).  In  fact,  increasing  numbers  of  EPIYA-C  motifs  have  been  suggested  to  be 
associated  with  cancer  development  (26,  305),  and  epidemiological  studies  have 
identified  a  correlation  between  increased  number  of  -C  motifs  and  heightened  disease 
severity  (20,  142,  394).  Since  EPIYA-D  has  the  strongest  affinity  for  SHP-2  (142),  it  is 
not  surprising  that  East  Asian  CagA  produces  more  inflammation  and  atrophy  (27)  as 
well  as  greater  morphological  changes  in  infected  cells  (142).  Moreover,  the  variability 
in  CagA  is  important  when  analyzing  the  geographic  areas  with  the  highest  gastric  cancer 
rates;  these  areas  not  only  have  the  highest  colonization  rates,  but  also  contain  the  highest 
percentage  of  H.  pylori  strains  that  carry  the  cagA  gene,  in  particular  the  EPIYA-ABD 
allele  (7,  74,  137,  392).  Indeed,  we  identified  an  association  between  gastric  cancer 
development  and  EPIYA-ABD  CagA  through  a  large  scale  molecular  epidemiological 
study of strains from South Korea (162).

CagA Phosphorylation Independent Events
Physical  Effects  on  Host  Cells 

Despite  the  importance  of  CagA  phosphorylation,  CagA  has  many  effects  on  the 
host  cell,  and  some  of  these  effects  are  accomplished  in  a  phosphorylation  independent 
manner.  One  of  the  most  noticeable  CagA-dependent  effects  on  host  cells  is  the 
disruption  of  tight  junctions  and  induction  of  changes  in  cell  morphology.  CagA  has 
been  shown  to  affect  cellular  tight  junctions  in  a  phosphorylation-independent  manner 
(16,  30,  240,  262),  and  has  been  shown  to  be  important  for  the  recruitment  of  the 
junctional  adhesion  molecule  (JAM)  and  the  tight  junction  protein,  zona  occludens-1 
(ZO-1)  to  points  of  bacterial  contact  (16,  30).  Murata-Kamiya,  et  al.  showed  by 
immunoprecipitation  that  E-cadherin  physically  interacts  with  both  wild  type  and 
phosphorylation  resistant  variants  of  CagA,  and  that  this  interaction  inhibits  the 
association of E-cadherin with β-catenin, which subsequently results in the accumulation
of nuclear and cytoplasmic β-catenin (240).

Additionally,  it  has  been  demonstrated  that  CagA  binds  to  and  prevents  the  kinase 
activity of the partitioning-defective 1/microtubule affinity-regulating kinase
(Par1b/MARK2), thereby escalating the loss of tight junctions and polarity (301, 406).
CagA binds to Par1b, as well as other members of this kinase family, through the
multimerization sequence (198, 301); specifically, 14 of the 16 amino acids of the
multimerization motif are required (FPLKRHDKVDDLSK; 247). This interaction has
been suggested to contribute to host cell elongation (301, 370). East Asian CagA binds
Par1b with a stronger affinity than CagA from Western strains, and the efficiency and
strength of binding to Par1b among Western strains appear proportional to the number
of Western multimerization sequences (198). In addition to effects on cell elongation and
disruption of cellular junctions, interaction of CagA and Par1b also causes spindle
dysfunction, which delays progression from prophase to metaphase and is hypothesized
to result in DNA instability (197).
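
Because binding strength among Western strains appears to scale with the number of multimerization sequences, a simple count of motif copies in a CagA protein sequence can be useful. The following is a minimal sketch only: it assumes exact, non-overlapping matches to the 14 residue sequence quoted above, whereas real CagA sequences vary, so a practical analysis would need to tolerate mismatches. The function name and the toy sequences in the usage note are hypothetical.

```python
# The 14-residue multimerization sequence quoted in the text above.
CM_CORE = "FPLKRHDKVDDLSK"


def count_motif(protein_seq: str, motif: str = CM_CORE) -> int:
    """Count non-overlapping exact occurrences of `motif` in a protein sequence.

    Whitespace (for example, line breaks from a FASTA record body) is removed
    and the sequence is upper-cased before matching.
    """
    seq = "".join(protein_seq.split()).upper()
    return seq.count(motif)
```

As a usage example with a made-up sequence, `count_motif("MT" + CM_CORE + "GG" + CM_CORE)` counts two copies of the motif.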

Moreover,  recent  evidence  has  shown  that  CagA  has  an  effect  on  how  invasive 
H.  pylori  can  be.  Though  considered  an  extracellular  pathogen,  numerous  studies  have 
shown  that  H.  pylori  is  able  to  invade  and  survive  inside  host  cells  (15,  256,  314).  Upon 
infection with cagA positive H. pylori strains, a multiprotein complex is formed through
the association of CagA, c-Met, E-cadherin and p120-catenin, and this complex
influences  whether  the  bacteria  can  become  intracellular  (262).  When  this  complex  is 
present  in  a  cell  line  that  H.  pylori  can  normally  invade,  it  suppresses  the  ability  of  H. 
pylori  to  be  internalized  (262). 

Cellular  Differentiation 

Since  there  is  a  causal  link  between  gastric  cancer,  H.  pylori  infection,  and  the 
presence  of  cagA,  there  is  no  doubt  that  effects  on  host  cell  differentiation  and 
proliferation  are  important  for  ultimate  disease  progression.  In  keeping  with  this,  cagA 
positive strains of H. pylori influence a factor with known oncogenic potential, β-catenin
(90, 108). β-catenin has two distinct functions: it links cadherins with the actin
cytoskeleton and is part of the WNT signaling pathway (359). When unphosphorylated
CagA binds E-cadherin, it prevents the formation of the E-cadherin/β-catenin complex,
which ultimately leads to accumulation of β-catenin in both the nucleus and cytoplasm
(240). Kurashima, et al. showed that while phosphorylation of the EPIYA motifs was not
necessary for deregulation of β-catenin, the CagA multimerization sequence was
necessary (179). However, this process is likely multifactorial and complex since some
evidence indicates that E-cadherin dissociation is independent of CagA and that the
E-cadherin/β-catenin/p120ctn complex is not affected to the same degree in all studies (381).
When the E-cadherin/β-catenin complex is disrupted, cytoplasmic β-catenin is
dephosphorylated and then translocates to the nucleus, where it forms heteromers with
other transcription factors and drives transcription of a number of genes with oncogenic
potential (193, 240). In fact, CagA was found to upregulate the β-catenin-dependent cdx1 gene
(193, 240), which encodes a transcription factor important for transdifferentiation of
intestinal cells (229), as well as to affect the expression of goblet-cell mucin MUC2, an
intestinal-differentiation marker (240); both are indicators of gastric intestinal metaplasia.

Cell  Cycle,  Survival  and  Proliferation 

Increased  cellular  proliferation  is  one  indicator  of  cancer  that  has  been 
demonstrated to result from infection with cagA positive strains of H. pylori (278, 279).
This increased proliferation can happen through CagA-mediated activation of the
ERK/MAPK pathway. CagA activates ERK through interaction with growth factor receptor
bound 2 (Grb2), which appears to interact with both phosphorylated and nonphosphorylated
CagA (227, 313). However, it should be noted that the phosphorylation sequences
themselves are essential for Grb2 binding to non-phosphorylated CagA (227), which,
similar to the strategy for binding of CagA to SHP-2 (192), likely occurs through CagA
binding to the Grb2 SH2 domains. In a normal cell, upon receiving an extracellular
signal, Grb2 binds son of sevenless (SoS), which increases the formation of the Ras-GTP
complex and activates the Raf→MEK→ERK pathway. This pathway increases
activation of transcription factors involved in cell proliferation (114). In CagA
intoxicated cells, CagA can bind Grb2, and then the CagA/Grb2/SoS complex activates
the Ras→ERK pathway as described above (227).

The activities of the serum response factor and serum response element (SRF and SRE,
respectively) transcription factors are also increased in CagA transfected cells in a CagA
phosphorylation independent manner (145). Activation of SRE appears to be mediated
by increased DNA-binding by the E-26 like protein-1 (Elk1; 145). Evidence of the role
of CagA in SRE/SRF activation can be found through CagA-mediated increases in levels
of the anti-apoptotic protein myeloid cell leukemia sequence-1 (MCL1), which acts as a
pro-survival  factor  (226).  Furthermore,  the  tendency  of  CagA  to  affect  transcription 
factor  activity  seems  to  be  a  common  theme  since  the  signal  transducer  and  activator  of 
transcription  3  (STAT3)  pathway,  which  induces  cellular  proliferation  (175),  has  been 
shown  to  be  induced  in  vitro  and  in  vivo  in  a  CagA-dependent  but  phosphorylation- 
independent  manner  (52).  Moreover,  it  was  recently  confirmed  that  nonphosphorylated 
CagA  preferentially  activates  STAT3  (184).  The  activation  of  so  many  transcription 
factors  by  CagA  is  evidence  of  the  broad  influence  of  CagA  on  a  wide  variety  of  cellular 
functions. 

CagA  appears  to  influence  the  development  of  MALT  lymphoma  in  a 
phosphorylation  independent  manner;  CagA  can  inhibit  apoptosis  of  B-cells  through 
inhibition  of  the  accumulation  of  p53  due  to  decreased  p53  transcription  (371). 
Interestingly,  B-cell  survival  is  also  likely  due  in  part  to  an  increase  in  phosphorylated 
ERK1/2 (409), which when moderately increased can inhibit apoptosis and promote
proliferation (271). However, it should be noted that transfection with CagA also leads to
phosphorylation of the pro-apoptotic protein, Bad (409). Conversely, translocation of
CagA  by  H.  pylori  results  in  upregulation  of  both  the  ERK  and  p38  pathways,  which  lead 
to  upregulation  of  the  pro-survival  proteins,  Bcl-2  and  Bcl-XL  (195).  Clearly,  CagA  has 
an  effect  on  cell  survival,  and  most  of  the  literature  suggests  that  CagA  can  inhibit 
apoptosis  of  B  cells,  which  likely  promotes  the  development  of  MALT  lymphoma. 

Conversely,  ectopically  expressed  CagA  can  also  suppress  cellular  proliferation  in 
IL-3-dependent B-lymphoid cells through suppression of JAK-STAT signaling (371).
Moreover, transfection of AGS cells with CagA results in increased expression of the
pro-apoptotic factor p21WAF1/Cip1 due to CagA-mediated nuclear translocation of the
nuclear factor of activated T cells family transcription factor (NFATc3; 404).
Interestingly, p21WAF1 expression can also occur as a direct result of excessive ERK1/2
activation (38, 270, 405). Finally, CagA-induced deregulation of β-catenin increases the
expression of Cyclin D1 (369), which influences progression of cells from G1 to S phase,
thereby promoting cell survival in a CagA phosphorylation-independent manner (58,
240). CagA obviously affects cell survival, proliferation, and differentiation, all of which
can  affect  the  progression  of  disease,  including  development  of  gastric  cancer. 

Cell  Scattering 

In  order  for  cancer  cells  to  spread  or  metastasize,  they  must  detach  and  scatter  to  a 
new  area.  CagA  increases  cell  scattering  by  targeting  the  hepatocyte  growth  factor 
receptor (c-Met), which acts as an adaptor molecule for proteins like Grb2, phospholipase
Cγ (PLCγ), and STAT3 (63, 227). CagA/Grb2/SoS→Ras-GTP complex→Raf→MEK→ERK
signaling leads not only to an increase in transcription factors that
promote cellular proliferation, but also to an increase in cell scattering (227). In support
of this, cell scattering due to H. pylori infection is suppressed by blocking PLCγ activity
(63). Once again, however, the role of CagA in this process is complex; some work
suggests no association between enzymatic activity of PLCγ and the cagA status of H.
pylori strains (43). Nevertheless, it is clear that nonphosphorylated CagA interacts with
c-Met through the multimerization domain (343). The consequence of this
interaction is activation of phosphatidylinositol 3-kinase (PI3K) signaling through Akt,
which subsequently activates NF-κB and β-catenin (343). It has been suggested that
CagA binds to c-Met and then recruits TNF receptor associated factor 6 (TRAF6; 337)
and poly-ubiquitinated transforming growth factor-β-activating kinase 1 (TAK1; 180,
181). TRAF6 then activates Akt (399), which potentially activates NF-κB and RelA
through activation of the IκB kinase (IKK) complex (343, 348). CagA modulates
multiple pathways that impact cell scattering, and the activated pathways
appear to have multiple downstream targets that can affect numerous cellular processes in
addition to cell scattering.

Inflammation 

A  hallmark  of  Helicobacter  pylori  infection  is  increased  and  chronic 
inflammation. This appears to occur due to activation of NF-κB and persistent induction
of IL-8. While it remains controversial (73, 181, 307), this IL-8 induction has been
shown to be CagA dependent through studies that ectopically expressed CagA or various
CagA EPIYA motifs (49, 170), through IL-8 promoter reporter assays (319), and through
analysis of inflammation and NF-κB activation in Mongolian gerbils infected with cagA
positive and cagA negative H. pylori strains (320). Moreover, Keates, et al. showed that
IL-8 secretion is affected by the activation of MAPKs by cagA positive H. pylori strains
(167). It has been demonstrated that activation of NF-κB and induction of IL-8 occur
through the activation of the Ras→Raf→MEK→ERK→NF-κB pathway and are
independent of SHP-2 or c-Met (49). Indeed, Brandt, et al. demonstrated that IL-8
induction was CagA phosphorylation-independent (49). Kim, et al. confirmed that
NF-κB activation and subsequent induction of IL-8 were due to activation of the MAPK
pathways and also analyzed the role of the different EPIYA motifs by analysis of
cells transfected with CagA constructs that differed only in the EPIYA region. Analysis
of Western CagA-specific sequences and East Asian CagA-specific sequences revealed
that the levels of IL-8 induction are not significantly different between the CagA variants
(170). However, it should be noted that Argent, et al. demonstrated that CagA-related
differences in IL-8 induction were dependent on the EPIYA motifs and that strains
containing East Asian CagA induce the greatest levels of IL-8 (19), although they did
not investigate the influence of phosphorylation status on these differences. Since
persistent inflammation is a hallmark of H. pylori infection and is linked to more severe
diseases, and since CagA affects the inflammatory process, it is easy to envision what an
important role CagA plays in persistent infection and disease development.

CagA Phosphorylation Dependent Events
Targeting  of  SHP-2 

The  most  striking  H.  pylori  induced  morphological  change  to  host  cells  is  the 
induction  of  the  “hummingbird  phenotype,”  which  occurs  as  a  direct  result  of 
phosphorylated  CagA  complexing  with  SHP-2  and  subsequent  increased  ERK1/2 
activation.  Normally,  SHP-2  functions  to  increase  cellular  proliferation  and  motility  and 
is  activated  by  interacting  with  a  phosphorylated  Gab  protein  (246).  CagA  has  been 
shown  to  be  able  to  mimic  the  function  of  the  eukaryotic  Gab  protein  (47,  136).  Both  in 
vitro  and  in  vivo  (398),  CagA  forms  a  complex  with  SHP-2  after  phosphorylation  of  an 
EPIYA-C  or  -D  motif  (142).  The  formation  of  this  complex,  as  well  as  the  subsequent 
deregulation  of  SHP-2  as  a  means  of  CagA-mediated  effects  on  gastric  cancer,  is  of 
relevance since mutations within the gene encoding SHP-2 (PTPN11) have been
identified in multiple forms of cancer (37, 351). Additionally, there is an increase in the
risk  of  gastric  cancer  development  in  H.  pylori  infected  patients  with  certain  PTPN11 
polymorphisms  (128).  Again,  this  demonstrates  the  potential  of  CagA  to  impact  disease 
progression. 

Activation  of  ERK 

The  ERK  MAP  kinases  are  activated  in  a  CagA  phosphorylation-dependent 
manner  upon  infection  with  H.  pylori  cagA  positive  strains,  leading  to  a  SHP-2  dependent 
change  in  cell  motility  and  morphology  (246).  Indeed,  inhibition  of  the  phosphorylation 
of  CagA,  knockdown  of  SHP-2  expression,  or  the  disruption  of  the  CagA/SHP-2  complex 
abrogates cell elongation, thus indicating that the “hummingbird phenotype” is a product
of the SHP-2/CagA complex (141, 143, 144, 308). In fact, this complex activates the
ERK pathway by activating Rap1→B-Raf→ERK and has been shown to activate ERK in
both a Ras-independent and dependent manner (141). In addition, CagA promotes cell
proliferation  through  activation  of  ERK,  which  subsequently  promotes  progression 
through  the  cell  cycle  (299,  365). 

Recent  data  indicate  that  the  phosphorylation  status  of  CagA  may  act  as  a 
signaling  switch  between  the  JAK/STAT3  and  SHP-2/ERK  pathways.  This  process  is 
mediated through gp130 (185). Unphosphorylated CagA activates STAT3, while
phosphorylated CagA preferentially induces ERK1/2 phosphorylation (185). This
differential  activation  based  on  phosphorylation  status  illustrates  the  complexity  of  the 
effects  that  CagA  has  on  the  host  cell. 

Non-ERK  Mediated  Cytoskeletal  Rearrangement  and  Scattering 

Important  steps  in  the  creation  of  elongated  cells  include  the  decrease  in  cellular 
adhesions  and  the  deregulation  of  the  actin-binding  proteins  that  maintain  proper  cellular 
shape  (48,  233).  Phosphorylated  CagA  binding  and  activation  of  SHP-2  leads  to 
increased  tyrosine  dephosphorylation  of  the  focal  adhesion  kinase  (FAK;  366),  which  is 
important  for  elongation  of  host  cells;  when  dominant-negative  FAK  is  expressed,  host 
cells  change  morphology,  while  constitutively  active  FAK  abrogates  the  formation  of  this 
morphological  change  (366).  Additionally,  studies  have  identified  multiple  actin-binding 
proteins  that,  when  tyrosine-dephosphorylated,  promote  CagA  phosphorylation 
dependent  cell  elongation.  These  include  vinculin  (232),  cortactin  (309,  312),  and  ezrin 
(310). Since the SHP-2 phosphatase is not required for the dephosphorylation of
cortactin, the dephosphorylation of these actin-binding proteins is likely a result of
blocked activity of a kinase, and therefore a product of the phosphorylated CagA negative
feedback loop that inhibits the Src kinase (312). In fact, this is the case for ezrin, as
inhibition of Src family kinases increases dephosphorylation of ezrin (310), and host
cell elongation is achieved simply through inactivation of Src, which results in
dephosphorylation of all Src substrates, including vinculin, ezrin, and cortactin (28). This
inhibition  of  Src  could  occur  either  directly  or  through  the  recruitment  of  C-terminal  Src 
kinase  (Csk),  as  described  in  the  next  section  (28,  312,  365).  Finally,  phosphorylated 
CagA  can  bind  Crk  adaptor  proteins  (Crk-I,  Crk-II,  and  Crk-L;  344).  This  interaction  is 
important  for  cell  scattering,  disruption  of  E-cadherin/catenin,  and  activation  of  Raf 
(344).  Furthermore,  it  was  recently  shown  that  CagA  phosphorylation  could  occur  via 
Abl  instead  of  Src,  thereby  activating  downstream  effects,  specifically  cell  scattering  and 
motility (284, 350). Moreover, Abl could also form a complex with CrkII and CagA
(350). Taken together, these findings show that phosphorylation of CagA is very
important for host cell shape and adhesion. This implicates the degree of
phosphorylation, which is a consequence of the cagA allele carried by a strain, as being
important for development of gastric carcinomas.

CagA  Feedback  Loop,  Src  vs.  Csk 

As  mentioned  above,  CagA  participates  in  a  negative  feedback  loop  that  allows 
regulation  of  the  amount  of  phosphorylated  CagA.  CagA  can  bind  Csk  via  direct 
interaction  with  the  EPIYA-A  and  -B  motifs  (365,  366).  Formation  of  this  complex  leads 
to inhibition of the Src family kinases (SFKs) through Csk tyrosine phosphorylation of an
inhibitory C-terminal residue on Src (138, 365, 366). CagA can also directly inhibit SFK
activity (312). While the purpose of this negative feedback system is not completely
clear, it appears that in its absence, CagA is excessively toxic to cells (137, 138, 365).
Thus, this loop has been hypothesized to promote long-term colonization by cagA
positive H. pylori strains (137).

Interactions  with  Unknown  Function 

Recent  proteomic  screens  identified  a  number  of  proteins  that  appear  to  interact 
with phosphorylated CagA (313). These include PI3K, Grb2, Ras-GAP, Grb7, and Shp1.
The consequences of these interactions with phosphorylated CagA are still unknown (313).
However, it is clear that both Grb2 and PI3K are actively involved in H. pylori
pathogenesis when CagA is not phosphorylated (227, 343). Indeed, unphosphorylated
CagA is known to bind to Grb2, which activates ERK signaling, and leads to increased
cellular proliferation, transcription factor activation, cell scattering (227), and activation
of Akt and PI3K signaling. These activation events subsequently stimulate both the
β-catenin and NF-κB pathways (343). Thus, the interaction of some of these proteins with
phosphorylated CagA may represent redundant mechanisms of action.

CagA  Independent/Redundancy 

Inflammation  appears  important  for  H.  pylori  growth  in  vivo.  For  instance,  H. 
pylori induced inflammation results in a decrease in the inhibitor of gastrin; gastrin has
been proven to be an H. pylori growth factor (41, 61, 189, 239). Given the importance of
inflammation for H. pylori colonization and persistence, there are redundant mechanisms
by which H. pylori induces inflammation. For instance, in addition to CagA effects,
NF-κB activation can also be achieved through TLR4 recognition of LPS (180, 258) or
through type IV secretion system (T4SS) delivered peptidoglycan (PG) binding to
nucleotide-binding oligomerization domain 1 (NOD1; Fig. 3; 165, 372). The
inflammatory response caused by the peptidoglycan-NOD1 interaction may be a result
of the activation of AP-1 (11), which functions as a transcription factor for cytokines and
chemokines such as IL-8 (59, 97, 159, 290). Recently, it has also been demonstrated that
the binding of NOD1 to its ligand, in this case peptidoglycan, activates RICK, which
allows RICK to interact directly with TRAF3 followed by TRAF3 interaction with TBK1
(378). TBK1, as well as IKKs, leads to the production of cytokines, such as the type I
interferons (IFN) like IFN-β. The production of IFN-β is responsible for NOD1’s ability
to increase the level of the chemokine IP-10 as well as the induction and nuclear
translocation of the transcription factor interferon-stimulated gene factor 3 (ISGF3; 378).
The presence of the cag pathogenicity island has also been shown to lead to increased
inflammation (57); 14 of the 27 genes within the island are essential for IL-8 induction
(102). With multiple bacterial factors that induce NF-κB, this raises the question of why
the bacterium needs mechanisms for such redundancy. Lamb, et al. postulate that since
CagA and peptidoglycan target different cellular signaling molecules, they may
synergistically activate NF-κB, and that this synergy may be important. Alternately, in
strains where CagA is not present or is not a potent inducer of IL-8 and NF-κB,
peptidoglycan may serve as the major inducer of the inflammatory response (180).
Both of these hypotheses highlight the importance of the induction of an inflammatory
response for H. pylori, probably due to the requirement for gastrin or other nutrients.

Figure 3: VacA and known host cell targets. A. A schematic representation of VacA
with the three major regions of polymorphisms (s, i, and m) is shown. Additionally,
schematics of the known alleles of each region are shown. The i region contains two
important polymorphic regions known as Cluster B and Cluster C, which are designated
by a B and C, respectively, on the diagram. The activity attributed to each of the regions
of the toxin (vacuolating activity or cellular tropism) is indicated, and the impact of
each allele on these effects is shown. The highest level of activity or the broadest tropism
is defined as ++, intermediate tropism is indicated by a +, low activity is indicated as a
+/-, no activity is designated by a -, and incomplete information is indicated by a ?. B. A
depiction of the gastric mucosa and known host pathways targeted by VacA is shown.
One of the receptors, sphingomyelin, is designated by SM. Pathways targeted in
epithelial cells and B and T cells are indicated. Additionally, activation of several
pathways by peptidoglycan (PG) and LPS is shown. This figure was adapted from an
earlier version by Rieder, et al. (294).

Vacuolating Cytotoxin A - VacA

VacA is another important factor that has been reported to affect H.
pylori virulence and to target numerous host cell pathways (Fig. 3). Activity of this
protein  was  found  when  H.  pylori  filtrates  were  shown  to  induce  large  host  cell  vacuoles 
(188).  The  VacA  cytotoxin  appears  to  be  produced  and  secreted  by  most,  if  not  all,  H. 
pylori  strains,  but  possesses  no  similarity  to  any  other  known  bacterial  or  eukaryotic 
protein  (25,  67).  Once  produced,  VacA  can  remain  on  the  bacterial  surface  (156)  or  be 
secreted as an approximately 88 kDa toxin (68). Secreted VacA monomers oligomerize
(6,  69,  183,  199)  but  dissociate  upon  exposure  to  a  non-neutral  environment.  In  fact, 
exposure  to  alkaline  or  acidic  conditions  actually  amplifies  the  activity  of  VacA  (69,  77, 
234,  389).  Once  secreted,  VacA  undergoes  proteolytic  cleavage  to  yield  two  smaller 
products,  p33  and  p55.  However,  to  date  the  consequence  of  this  cleavage  is  not 
understood  (252,  354,  362,  385,  400).  The  smaller  p33  product  and  about  100  amino 
acids  of  p55  are  responsible  for  the  vacuolating  activity  of  VacA  (75,  76,  402).  The  p33 
domain  is  strongly  hydrophobic  and  contains  characteristic  transmembrane  dimerization 
motifs  that  are  responsible  for  insertion  into  the  host  cellular  membrane  and  vacuolating 
activity  (169,  221,  223,  374,  400,  401),  whereas  the  p55  domain  has  a  crucial  role  in 
binding  to  host  cells  (116,  266,  292,  376,  377). 

Like  CagA,  VacA  is  polymorphic.  However,  unlike  CagA,  this  variation  begins 
within  the  amino-terminus  of  VacA.  Three  regions  of  variation  have  been  defined  and 
there  are  at  least  two  primary  variants  in  each  region;  the  regions  are  designated  as  the 
signal (s), intermediate (i), and middle (m) regions (Fig. 3; 24, 62, 293). The s region of
VacA is found in the p33 portion of the toxin and influences vacuolating activity and
efficiency of anion channel formation due to the hydrophobic nature of the amino acid
residues found near the proteolytic cleavage site (222, 286). The s2 variant undergoes
cleavage at an alternate site, thereby providing an extension of 12 hydrophilic amino
acids (24). The s1 variant contains more hydrophobic amino acids near the cleavage site
than the s2 variant; thus, the s1 sequence is more easily inserted into the host cell
membrane (186, 222). The m region is found in the p55 portion of the toxin and
influences host cell tropism; the m1 region is toxic to a wider range of host cells (14, 161,
266). The i region is located between the s and m regions and is the most recent region to
be described. The i region has been suggested to be the best indicator of disease severity
(293) and three primary variants have been identified (62). The i1 region is believed to
be associated with stronger vacuolating activity and more severe disease states than the i2
region (293). Furthermore, strains carrying VacA s1, i1, m1 or any combination of these
alleles are overall associated with more severe disease (34, 160, 187, 293). This
association could be due to increased anion channel formation, vacuolating activity, and
cell tropism from having the s1, i1, and m1 regions, respectively.

In  recent  years,  a  number  of  studies  have  elucidated  multiple  receptors  for  VacA 
and  shown  that  VacA  uses  different  receptors  based  on  different  host  cell  types  (315).  On 
epithelial cells, several different receptors have been identified. Among these is
RPTPα, which is a receptor-like protein tyrosine phosphatase that appears to be used by
VacA on G401 cells (a human kidney tumor cell line) and in AGS cells (an
adenocarcinoma cell line; 353, 390). Another receptor, which must be glycosylated
for VacA to bind to it, is RPTPβ, which can be used by VacA on AZ-521 cells (gastric
epithelial-derived cells; 389, 391). When RPTPβ is artificially increased in some cell
lines, sensitivity to VacA also increases (265). The importance of this receptor in vivo has
been demonstrated in RPTPβ knockout mice, which become resistant to VacA-mediated
ulceration (111). Additionally, sphingomyelin was recently identified as a receptor that is
important for VacA binding and vacuolating activity of the toxin (131, 132). Finally,
VacA can also bind to T cells using the lymphocyte function-associated antigen-1 (LFA-1;
316). The fact that VacA can use different receptors based on the cell type targeted
may  help  explain  this  toxin’s  diverse  functions.  While  much  is  known  about  the  various 
functions  of  the  toxin,  relatively  little  is  currently  known  about  the  exact  host  signaling 
pathways  affected  by  the  toxin.  Thus,  herein  we  discuss  what  is  currently  known  about 
the  major  cellular  processes  affected  by  VacA. 

VacA  Functions 

Anion  Channel  Formation  and  Vacuolation 

VacA  can  oligomerize  within  the  plasma  membrane  and  can  cause  formation  of 
anion-selective  channels  (346).  These  channels  may  be  used  to  increase  the  efflux  of 
complex  molecules,  such  as  bicarbonate  and  urea,  out  of  the  host  cell  (78,  346,  361), 
which  may  aid  H.  pylori  growth  (237).  In  addition  to  forming  anion-selective  channels  in 
vitro,  VacA  can  reduce  the  transepithelial  electrical  resistance  of  polarized  cells  by 
increasing paracellular epithelial permeability. This allows the release of some cations,
such as Fe3+ and Ni2+, as well as more complex molecules such as amino acids and sugars
(268).

One of the most striking effects of VacA on host cells is the creation of large
cytoplasmic  vacuoles  that  contain  the  markers  for  late  endosomes  and  lysosomes  (235). 
Once  VacA  is  internalized  by  the  host  cell,  it  is  trafficked  to  the  early  endosome  by  F- 
actin  containing  structures  (118).  Subsequently,  the  CD2-associated  protein  is  essential 
for  transferring  VacA  from  early  endosomes  to  late  endosomes  (117,  118).  The  process 
of  vacuolation  is  then  dependent  on  syntaxin  7  and  vesicle  associated  membrane  protein 
7  (VAMP7),  both  of  which  are  integral  to  late  endosomes  and  lysosomes  (220,  341). 
Additionally, this process requires vacuolar ATPase (V-ATPase) activity and
dynamin, both of which are crucial for the formation and stability of vesicles (70, 342).

Induction  of  Apoptosis 

Although the fact that VacA causes apoptosis has been known for some time, the
exact mechanism or mechanisms by which this occurs are still not completely
understood. Evidence shows that VacA-mediated apoptosis is dependent on interaction
with the mitochondria (79, 103, 115, 172, 259, 383, 384). Indeed, VacA has been shown
to reduce the membrane potential of the mitochondria, thereby allowing the release of
cytochrome c (115, 172, 383, 384). The modulation of the mitochondrial membrane
potential by VacA also results in impaired cell cycle progression and a drop in ATP
concentration (172). Several early studies showed that VacA that is deficient in its ability
to form channels fails to induce cytochrome c release (383, 384) and is unable to
modulate the mitochondrial membrane potential (383), suggesting that channel formation
is essential for these events.
However, some additional work has demonstrated that most VacA is localized to
vacuoles  inside  host  cells  (397).  This  finding  suggests  that  VacA-mediated  cell  death 
might  not  be  a  result  of  direct  binding  of  VacA  to  the  mitochondria,  but  perhaps  suggests 
that  VacA-mediated  induction  of  the  pro-apoptotic  factors  in  the  Bcl-2  family  might  be 
involved.  In  keeping  with  this,  it  has  been  suggested  that  these  pro-apoptotic  factors 
actually  interact  with  the  mitochondria  to  release  cytochrome  c,  and  VacA  has  been 
shown  to  increase  the  level  of  pro-apoptotic  Bax  in  a  manner  that  mirrors  the  release  of 
cytochrome c from the mitochondria (397). Additionally, VacA can induce the
cleavage of poly(ADP-ribose) polymerase (PARP) through activation of the death factor
caspase-3 in transfected cells. Furthermore, this cleavage can be inhibited by the
overexpression of the anti-apoptotic factor Bcl2 (115, 397). Taken together, these data
suggest  that  VacA  has  two  potential  mechanisms  to  induce  apoptosis  in  intoxicated  cells. 

Disruption  of  Cellular  Pathways 

VacA  deregulates  multiple  cellular  pathways  as  well  as  inducing  inflammation. 
VacA  intoxication  induces  production  of  a  variety  of  inflammatory  cytokines  that  include 
TNF-α, IL-1β, IL-6, IL-10, and IL-13 (339). Moreover, IL-8 is produced by several
different cell lines in response to VacA-mediated activation of the p38 MAPK through an
increase in intracellular calcium and the subsequent activation of ATF-2, CREB, and
NF-κB (147).

VacA  increases  the  activity  of  p38,  ERK,  and  the  activating  transcription  factor  2 
(ATF-2; 244). Through the p38/ATF-2 cascade, COX-2 is upregulated, which leads to
increased production of prostaglandin E2 (PGE2; 148). Conversely, in mice VacA
inhibits PGE2-stimulated duodenal epithelial bicarbonate (HCO3-) secretion by inducing
the release of mucosal histamine (368). While the reason for the discrepancy between the
VacA-mediated increase in PGE2 and the inhibition of its effects is unclear, it should be
noted that decreased duodenal epithelial HCO3- secretion is associated with duodenal
ulcers and may leave the mucosal layer less able to repair itself (157). This could
account for the role of VacA in gastric damage (157). Furthermore, VacA can also inhibit
gastric acid secretion by increasing the mobilization of intracellular calcium, which in
turn activates calpain 1 to hydrolyze the CagA-targeted cytoskeletal protein, ezrin (310,
375). Thus, in addition to affecting inflammatory pathways, VacA activates the p38 and
ERK pathways, leading to deregulation of molecules that directly correlate with gastric
damage.

Like CagA, VacA has also been shown to affect the β-catenin signaling pathway
and therefore, perhaps, the oncogenic potential of H. pylori. As stated earlier,
deregulation of this molecule affects many cellular pathways involved in migration, cell
cycle, polarity, and apoptosis, and numerous studies have demonstrated the effect that H.
pylori has on the β-catenin pathway (179, 241, 243, 329, 347). Recently, Tabassam, et
al. showed that VacA activates PI3K/p110α, which in turn activates Akt to phosphorylate
GSK3β. This phosphorylation ultimately frees β-catenin to translocate to the nucleus to
bind TCF/LEF and allows transcription of β-catenin dependent genes such as cyclin D1
and potentially other oncogenic genes such as cdx1 (193, 229, 347). Thus, this effect on
the β-catenin pathway again lends credence to the idea that the more virulent alleles of
VacA, which potentially could cause greater induction of this pathway, are associated
with more severe disease manifestations.

Finally, VacA also affects the autophagy pathway; VacA induces the formation of
autophagosomes and is associated with these structures (355). Moreover, the stability of
intracellular VacA is impacted by the presence of autophagosomes, and VacA stability is
increased when autophagy is inhibited (355).

Immune  Modulation 

VacA  has  many  roles,  but  one  important  function  that  may  directly  impact  H. 
pylori  colonization  and  persistence  is  its  ability  to  act  as  an  immune  modulator.  This 
immune  modulation  occurs  through  several  distinct  mechanisms.  For  instance,  VacA 
disrupts  the  process  of  phagosome  maturation  through  recruitment  and  retention  of 
coronin 1, which is also known as tryptophan-aspartate-containing coat protein (TACO;
408). However, despite this disruption in phagosome maturation, VacA does not seem to
impact the intracellular survival of H. pylori within monocytes (296). In addition, H. pylori-
infected macrophages form large vesicular compartments called megasomes and VacA
supports this process by increasing homotypic phagosome fusion (10, 408). This allows
H. pylori to persist in macrophages instead of being killed. VacA has also been shown to
inhibit  the  invariant  chain  dependent  pathway  of  antigen  presentation  by  MHC  class  II 
molecules  (236),  and  has  been  reported  to  interfere  with  the  presentation  of  antigen  in  B 
cells  (236).  More  recently,  VacA  has  been  reported  to  inhibit  both  PMA/anti-IgM  and  T 
cell  induced  B  cell  proliferation  (363). 

In  addition  to  these  pathways,  T  cells  are  also  affected  by  VacA.  VacA  can  enter 
activated, migrating primary T lymphocytes by binding to β2 integrin (CD18) and LFA-1
(316); LFA-1 is essential for this process since T cells deficient in LFA-1 are resistant to
the effects of VacA (316). Intoxication by VacA can then inhibit the proliferation of
CD8+ T cells (363) through down-regulation of the expression of the interleukin 2 (IL-2)
surface receptor-α and inhibition of the production of IL-2, both of which are required for
T-cell proliferation and survival. These IL-2 effects occur through the inhibition of
NFAT (45, 119, 338). This disruption in normal NFAT signaling may be due to blocked
dephosphorylation of NFAT, which could occur by blocking the influx of calcium that is
required for dephosphorylation by the calcium-calmodulin-dependent phosphatase
calcineurin and subsequent nuclear translocation of NFAT (45, 119). The down-regulation
of IL-2 decreases cyclins D3 and E, which in turn decreases production of the
retinoblastoma protein. This decrease induces cell cycle arrest in the G1 phase (119).

With expression of over 100 genes altered in T cells upon intoxication with VacA
(119), it is perhaps not surprising that some redundancy exists in this process. A recent
study found that VacA inhibition of CD4+ T cell proliferation is independent of NFAT-
induced IL-2 activation (338). Furthermore, VacA can induce the p38 MAPK pathway
within T cells, neutrophils and macrophages (45). In T cells, p38 is activated through
activation of serine-threonine kinases (MKK3/6), which are linked to signaling
molecules through a Rho family GTPase exchange factor, Vav (45, 53, 263). Through its
exchange activity on Rac, Vav is linked to the reorganization of the cytoskeleton (53),
and VacA uses Rac1 to rearrange the host cell cytoskeleton (45, 151). Taken together,
all of these immune modulations by VacA probably allow for persistent infection with H.
pylori. Additionally, perhaps the importance of immune modulation has led to the
selective pressure to maintain expression of VacA in most H. pylori strains.

Interactions between CagA and VacA

A growing body of literature suggests that the CagA and VacA
toxins interact, and that this interaction has an effect on disease severity (160). Early on,
Yokoyama, et al. showed an antagonistic effect between CagA and VacA on the NFAT
pathway (404); CagA activates the NFAT pathway via activation of calcineurin through
phospholipase Cγ, whereas VacA inhibits NFAT by preventing calcineurin
activation, owing to decreased calcium influx caused by VacA-mediated pores (404).

Moreover,  recent  transfection  assays  showed  that  CagA  blocks  the  apoptotic  activity  of 
VacA  by  two  different  mechanisms  (259);  phosphorylated  CagA  blocks  the  ability  of 
VacA  to  traffic  to  intracellular  compartments,  whereas  unphosphorylated  CagA  blocks 
apoptosis  in  a  manner  that  mimics  Bcl2  (an  anti-apoptotic  factor)  overexpression  (259). 
However, Bcl2 expression was not shown to be increased by CagA (259). In fact, CagA
not only blocks the cytotoxicity of VacA, but also blocks the ability of VacA to enter host
cells (8).

Additionally, VacA and CagA show antagonistic activities with regard to cellular
morphology. In cells co-cultured with isogenic H. pylori mutant strains deficient in cagA
or vacA, increased vacuolation was seen in cells infected with cagA mutants, whereas
cells infected with vacA mutants showed greater elongation of cells (21). In other words,
protrusion length was reduced in cells displaying vacuoles, and the number of vacuoles
was decreased in elongated cells (21). At a mechanistic level, activation of ERK1/2 by
CagA is important for cell scattering and morphological changes (246), and VacA
inhibits activation of ERK1/2 through inhibition of the activation of the epidermal growth
factor receptor (EGFR) and the human epidermal growth factor receptor 2 (HER2/Neu;
353). One explanation for this antagonism, which was suggested by Akada, et al., is that
CagA is injected into the cells to which the bacteria are attached, which then protects those
cells from the cytotoxic activity of VacA. VacA then proceeds to attack distant cells,
thereby freeing nutrients (8). Overall, these combined interactions may explain our
observation of a link between the most active vacA allele (s1/i1/m1), the most pathogenic
cagA allele (EPIYA-ABD), and more severe disease manifestations (160).

Conclusion 

H.  pylori  is  a  medically  important  bacterium  that  possesses  a  wide  variety  of 
virulence  factors  that  allow  it  to  thrive  in  the  hostile  environment  of  the  stomach.  The 
causal  link  between  H.  pylori  infection  and  gastric  cancer  development  has  led  to 
numerous  studies  designed  to  ascertain  the  role  of  virulence  factors  in  the  establishment 
of disease (40, 275, 276, 349). Some virulence factors, such as HomB (163, 260) and
BabA (121, 230), are just beginning to be linked by epidemiological evidence to
the development of more severe disease. Conversely, CagA (42, 152, 178, 274,
388)  and  VacA  (34,  160,  187,  293,  388)  have  been  studied  extensively. 

CagA  was  the  first  H.  pylori  virulence  factor  to  be  associated  with  more  severe 
disease (42, 65, 72, 133) and has been shown to affect cellular processes that include
β-catenin (240), ERK (141, 143, 144, 227), and the inflammatory pathways (49, 170), to
name a few (Figure 1). This toxin functions in both a phosphorylation-dependent and
-independent manner, and polymorphisms located in the carboxyl terminus lead to
differential induction of several cellular pathways (138, 142, 143). Since these
polymorphisms affect the phosphorylation sites, one might assume that these variations
would only affect pathways that are CagA phosphorylation dependent. However, even in
its unphosphorylated state the sequence differences within this region of CagA affect the
ability of the protein to multimerize, thereby leading to differential induction of CagA
phosphorylation independent pathways as well (179, 198). Meanwhile, a multitude of
studies have assessed the effects of VacA on host cells (Fig. 3). This toxin also has a vast
array of functions that range from induction of apoptosis (115, 172, 383, 384, 397) to
modulation of the immune system (Fig. 3; 236, 363, 408). Again, there are
polymorphisms within the VacA toxin that affect the range of cells it can intoxicate (161,
266), as well as its ability to integrate into membranes and cause downstream effects
(186, 222, 293).

These two distinct toxins clearly have some overlap in their functions. Both are
able to affect cell shape (141, 143-144, 227, 310, 375), affect immune cells (235, 363,
371, 408-409), and activate oncogenic pathways such as β-catenin (240, 347). They also
clearly have antagonistic effects on each other, such as dampening the phenotypic effects
on the host cell (cellular elongation induced by CagA versus vacuolation caused by
VacA; 21). CagA also has the ability to prevent VacA-induced apoptosis, whereas VacA
can prevent CagA-induced nuclear translocation of NFAT (404). It is believed that this
antagonistic relationship exists to increase the life of the host cell (21), and it has been
shown that the more active form of VacA is often associated with the more active form of
CagA and is thus further linked to more severe gastric disease (160). In addition to
investigating the impacts of these toxins individually on host cells, more knowledge is
needed on the interaction of these toxins and their combined impact on gastric disease.

Goals and Specific Aims

The  goal  of  the  work  reported  within  this  dissertation  was  to  explore  the  role  of  the 
different  polymorphisms  within  CagA  and  VacA  in  the  development  of  gastric  disease 
(Chapters  Two,  Three,  and  Four).  At  the  same  time,  different  in  vitro  and  in  vivo  assays  were 
optimized  to  be  able  to  detect  biological  differences  among  the  different  polymorphic  forms 
of  CagA  (Chapter  Five).  In  order  to  accomplish  these  goals,  large  epidemiological  and 
molecular  epidemiological  studies  were  conducted.  Furthermore,  isogenic  strains  containing 
the  different  polymorphic  forms  of  CagA  were  created  and  characterized.  Although  the 
parental  strain  was  ultimately  found  to  contain  a  secondary  mutation,  the  optimized 
techniques  are  now  a  part  of  the  lab  protocol  repertoire.  Taken  together,  this  work  explores 
the  role  of  different  virulence  factors  in  the  development  of  severe  gastric  disease,  and 
provides  novel  information  on  the  hierarchy  of  interaction  of  these  virulence  factors. 
Ultimately, this information could be used to create more efficacious prevention strategies or
treatment options for gastric cancer.

Abstract

Helicobacter  pylori  causes  diseases  ranging  from  gastritis  to  peptic  ulcer  disease 
to  gastric  cancer.  Geographically,  areas  with  high  incidences  of  H.  pylori  infection  often 
overlap  with  areas  with  high  incidences  of  gastric  cancer,  which  remains  one  of  the 
leading  causes  of  cancer-related  death  worldwide.  Strains  of  H.  pylori  that  carry  the 
virulence factor cytotoxin-associated gene A (cagA) are much more likely to be
associated with development of gastric cancer. Moreover, particular C-terminal
polymorphisms in CagA vary by geography, and have been suggested to influence
disease  development.  We  conducted  a  large-scale  molecular  epidemiologic  analysis  of 
South  Korean  strains  and  herein  report  a  statistical  link  between  the  East  Asian  CagA, 
EPIYA-ABD  genotype  and  development  of  gastric  cancer.  Characterization  of  a  subset 
of  the  Korean  isolates  showed  that  all  strains  from  cancer  patients  expressed  and 
delivered  phosphorylatable  CagA  to  host  cells,  whereas  the  presence  of  the  cagA  gene  did 
not  strictly  correlate  to  expression  and  delivery  of  CagA  in  all  non-cancer  strains. 

Introduction 

Helicobacter  pylori  is  a  medically  important  pathogen,  and  although  infection 
rates  vary  geographically,  this  bacterium  colonizes  more  than  50%  of  the  world’s 
population  (1,  28).  This  spiral  shaped,  Gram-negative,  microaerophilic  bacterium 
chronically  inhabits  the  unforgiving  environment  of  the  stomach,  and  causes  subclinical 
gastritis  in  the  majority  of  patients.  However,  in  some  individuals,  H.  pylori  colonization 
results  in  peptic  ulcer  disease;  75%  of  gastric  ulcers  and  90%  of  duodenal  ulcers  are 
attributed to H. pylori infection (20). In its most severe sequelae, H. pylori infection can
lead  to  the  development  of  two  forms  of  gastric  cancer:  adenocarcinoma  and  mucosa- 
associated  lymphoid  tissue  (MALT)  lymphoma  (12,  33,  34,  43).  The  association  of  H. 
pylori  with  stomach  cancer  led  the  World  Health  Organization  to  classify  it  as  a  class  I 
carcinogen  in  1994  (26).  It  currently  remains  the  only  bacterium  to  obtain  this  perilous 
distinction. 

Gastric cancer is the second most common cause of cancer death worldwide, and
this fact could be reflective of the high incidence of H. pylori infection (19, 30, 32, 47).
Interestingly, geographic areas with the highest level of gastric cancer, which include
most East Asian countries, also have the highest rate of H. pylori infection (2, 19, 47).
Additionally, in East Asian countries, 90% of strains carry the cytotoxin-associated gene
A (cagA; 27), which has emerged as a major contributor to disease severity. In fact,
cagA-positive H. pylori strains are at least twice as likely to cause cancer as H. pylori
strains without cagA (13, 22).

cagA  is  carried  on  the  cag  pathogenicity  island  (PAI),  which  carries  genes  that 
produce  a  type  IV  secretion  apparatus  that  is  used  to  directly  inject  CagA  into  host  cells 
(16). Within the cells, CagA is phosphorylated by host cell kinases, forms a complex
with SHP-2 (Src homology region 2-containing phosphatase 2; 25), and alters
multiple host signaling pathways (23-25, 29, 36, 46). The phosphorylation of CagA
occurs in the carboxy terminus on conserved tyrosine residues that are part of a repeated
five amino acid sequence (Glu-Pro-Ile-Tyr-Ala) referred to as the EPIYA motif (24, 25).

Initial  studies  showed  that  CagA  proteins  from  various  H.  pylori  isolates  migrated 
differently  on  denaturing  gels  (18).  It  was  subsequently  shown  that  a  number  of  cagA 
alleles  exist,  and  that  variation  in  the  carboxy  terminus  of  the  protein  is  the  major 
difference between the different alleles. Polymorphisms in the C terminus occur in the
EPIYA  region  and  typically  involve  changes  in  the  amino  acid  sequences  flanking  the 
five  amino  acid  repeat.  The  most  common  motifs  have  been  designated  as  EPIYA-A,  -B, 
-C,  and  -D  (24),  and  are  found  in  two  distinct  combinations  by  geographic  location. 
Western  CagA  consists  of  a  combination  of  EPIYA-A,  -B,  and  -C  motifs  (up  to  five  -C 
motifs  have  been  identified),  whereas  East  Asian  CagA  contains  a  combination  of 
EPIYA-A, -B, and -D motifs (5, 18, 24, 25, 31, 42).

EPIYA-C and -D serve as the primary CagA phosphorylation sites and are
required  for  binding  to  SHP-2  (24).  Among  Western  isolates,  molecular  epidemiological 
studies  have  indicated  a  correlation  between  disease  severity  and  increased  number  of 
EPIYA-C motifs (6, 24, 48). Indeed, in cases where Western strains are associated with
cancer, most have multiple EPIYA-C motifs (9, 39). This increase may be due to
elevated morphological transformation as a result of increased CagA phosphorylation and
SHP-2  binding  (24). 

East Asian CagA containing the EPIYA-D motif demonstrates higher affinity for
SHP-2  than  Western  CagA.  This  leads  to  greater  morphological  changes  in  infected  cells 
(24),  as  well  as  greater  levels  of  inflammation  and  atrophy  (10).  These  findings,  along 
with  the  fact  that  East  Asian  strains  predominate  in  countries  with  the  highest  rates  of 
gastric  cancer,  suggest  that  East  Asian  CagA  may  have  the  potential  to  induce  more 
severe  forms  of  gastric  disease  (2,  19,  47). 
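
The motif combinations described above can be expressed as a simple classification rule. The following Python sketch is illustrative only and is not part of the genotyping protocol used in this study. The function names, the accepted input format (a string of motif letters such as "ABD", optionally prefixed with "EPIYA-"), and the "mixed" and "unclassified" labels are assumptions introduced for the example.

```python
# Illustrative sketch only: names, input format, and the "mixed" /
# "unclassified" labels are assumptions, not part of the study protocol.
VALID_MOTIFS = set("ABCD")


def normalize_motifs(genotype: str) -> str:
    """Return the bare motif letters, e.g. "EPIYA-ABD" -> "ABD"."""
    motifs = genotype.strip().upper()
    if motifs.startswith("EPIYA-"):
        motifs = motifs[len("EPIYA-"):]
    if not motifs or set(motifs) - VALID_MOTIFS:
        raise ValueError(f"unrecognized EPIYA genotype: {genotype!r}")
    return motifs


def classify_caga(genotype: str) -> str:
    """Label a CagA genotype by the presence of EPIYA-C or EPIYA-D motifs."""
    motifs = normalize_motifs(genotype)
    has_c = "C" in motifs
    has_d = "D" in motifs
    if has_d and not has_c:
        return "East Asian"    # A, B, and D motifs
    if has_c and not has_d:
        return "Western"       # A, B, and one or more C motifs
    if has_c and has_d:
        return "mixed"         # hypothetical label for C and D together
    return "unclassified"      # neither C nor D detected


def count_c_motifs(genotype: str) -> int:
    """Count EPIYA-C repeats, which the text links to severity in Western strains."""
    return normalize_motifs(genotype).count("C")
```

For instance, classify_caga("EPIYA-ABD") returns "East Asian", and count_c_motifs("ABCCC") returns 3.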

In order to assess the correlation between cagA genotype and H. pylori-induced
disease  severity,  we  examined  a  collection  of  isolates  from  South  Korea,  which  has  one 
of  the  highest  rates  of  H.  pylori  colonization  (44),  and  one  of  the  highest  rates  of  gastric 
cancer  in  the  world  (22,  41).  Additionally,  the  majority  of  South  Korean  strains  carry  the 
East  Asian  cagA  allele  (17).  Here  we  present  molecular  epidemiologic  evidence  that 
there  is  a  significant  association  between  the  development  of  gastric  cancer  and  infection 
with H. pylori strains carrying the EPIYA-ABD genotype.

Materials and Methods

Bacterial  Strains 

Korean bacterial strains, along with G27-MA (3) and its isogenic derivatives G27-MA
ΔcagA (4) and G27-MA ΔPAI [provided by Manuel Amieva and constructed as
described in Galgani, et al. (21)], were cultured as previously described (15). Briefly,
bacterial stocks preserved at -80°C were grown and expanded on antibiotic-supplemented
horse blood agar (HBA) plates. Overnight liquid cultures in brucella broth (BB; Acumedia,
Lansing, MI) containing 10% fetal bovine serum (Invitrogen, Carlsbad, CA) and 10
µg/ml vancomycin (Amresco, Solon, OH) were subcultured to an optical density at 600
nm of 0.05 in fresh media and were grown for 18 hours under microaerophilic conditions
created by an Anoxomat evacuation/replacement system (Spiral Biotech, Norwood, MA).
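
As a worked illustration of the subculturing step, the volume of overnight culture needed to reach a starting optical density of 0.05 can be estimated with the standard dilution relation C1 x V1 = C2 x V2. The Python sketch below is an assumption-laden example and does not reproduce a calculation taken from the protocol. It assumes that OD600 scales linearly with cell density over the range used, and the function name and parameters are hypothetical.

```python
# Illustrative sketch only: assumes OD600 is linear with cell density and
# uses hypothetical names; not taken from the original protocol.
from __future__ import annotations


def subculture_volumes(overnight_od600: float,
                       final_volume_ml: float,
                       target_od600: float = 0.05) -> tuple[float, float]:
    """Return (inoculum_ml, fresh_medium_ml) from C1 * V1 = C2 * V2."""
    if min(overnight_od600, final_volume_ml, target_od600) <= 0:
        raise ValueError("all inputs must be positive")
    if overnight_od600 < target_od600:
        raise ValueError("overnight culture is below the target OD600")
    # V1 = C2 * V2 / C1
    inoculum_ml = target_od600 * final_volume_ml / overnight_od600
    return inoculum_ml, final_volume_ml - inoculum_ml
```

Under these assumptions, an overnight culture at an OD600 of 1.0 would require 1 ml of inoculum in 19 ml of fresh medium to give 20 ml at an OD600 of 0.05.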

Clinical  Isolate  Acquisition 

Isolates  were  obtained  from  patients  presenting  with  gastric  symptoms  to  the 
Division  of  Gastroenterology  in  the  Department  of  Internal  Medicine  at  the  College  of 
Medicine  of  The  Catholic  University  of  Korea  in  Seoul,  South  Korea.  Written  informed 
consent  was  received  from  each  patient,  and  the  protocol  was  approved  by  the 
Institutional  Review  Board  of  Human  Research  at  The  Catholic  University  of  Korea. 
Biopsies  were  collected  at  the  site  of  visible  mucosal  disturbance  and  histology  was 
performed  to  provide  a  diagnosis  along  with  culture  for  the  presence  of  Gram-negative, 
spiral  shaped  bacteria  that  produced  a  functional  urease  enzyme.  Subsequently,  a  single 
colony  isolate  was  selected  from  each  biopsy  for  further  characterization.  An  extensive 
breakdown of the epidemiological characteristics of the patients can be found in Table 2,
and a complete list of strains is available as Table S1 in the supplemental material.
Strains are named such that the letters following the strain number indicate the disease
state as follows: cancer (-CA), duodenal ulcer (-DU), gastritis (-G), gastric ulcer (-GU).

cagA  Genotyping 

All the primers used in this study are listed in Table 1. Genomic DNA was
extracted using the Easy DNA kit (Invitrogen, Carlsbad, CA), and genotyping of the C
terminus of cagA was performed by PCR using a modified version of the method developed by
Argent et al. (8), as schematically depicted in Fig. 4A, 4B, and Supplementary Fig. S1 in
the supplemental material. Briefly, amplification with primers cagA28F or cag2 and
cagA-P1C or cagA-pA-1 (R) identifies an EPIYA-A motif. Amplification with primers
cagA28F or cag2 and a 1:1 mixture of primers cagA-P2TA and cagA-P2CG indicates an
EPIYA-B motif. Amplification using primers cagA28F or cag2 and cagA-P3E identifies
either the presence of an EPIYA-C or an EPIYA-D motif, and an additional amplicon
with cagA28F or cag2 and the unique cagA-pD (R) primer categorizes the CagA as
having an EPIYA-D motif. In some cases where the PCR amplification was inconclusive,
and to confirm results of the PCR genotyping, cagA was amplified with cag2 and either
cagA seq (R) or grace2 (which lies within the downstream conserved glutamate racemase
gene) and then sequenced using primers cag2 and cagA seq (R). Sanger dideoxy
sequencing was performed by the Uniformed Services University of the Health Sciences
Biomedical Instrumentation Center (Bethesda, MD) or at Cosmo Genetech Co, Ltd
(Seoul, Korea). The resulting DNA sequences were analyzed using Vector NTI version
9.1 (Invitrogen) and Sequencher 4.5 (Gene Codes Corp., Ann Arbor, MI).
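
The primer logic described in this section can be summarized as a small decision procedure. The Python sketch below is illustrative only and is not the authors' software: the function names and boolean inputs are hypothetical, only presence or absence of each amplicon is considered (band sizes are ignored, so repeated motifs such as -AABD cannot be distinguished), and treating a D-specific product without a cagA-P3E product as inconclusive is an assumption.

```python
def call_epiya_motif(a: bool, b: bool, c_or_d: bool, d_specific: bool) -> str:
    """Return an EPIYA call from presence/absence of four PCR products.

    a          -- product with cagA-P1C or cagA-pA-1 (R)    -> EPIYA-A
    b          -- product with the cagA-P2TA/cagA-P2CG mix  -> EPIYA-B
    c_or_d     -- product with cagA-P3E                     -> EPIYA-C or EPIYA-D
    d_specific -- product with cagA-pD (R)                  -> EPIYA-D
    """
    if not (a or b or c_or_d or d_specific):
        return "no product"
    if d_specific and not c_or_d:
        # Assumption: conflicting reactions are sent for sequencing.
        return "inconclusive"
    motifs = ("A" if a else "") + ("B" if b else "")
    if d_specific:
        motifs += "D"
    elif c_or_d:
        motifs += "C"
    return "EPIYA-" + motifs


def cag_a_type(call: str) -> str:
    """Classify a call as East Asian (ends in D) or Western (ends in C)."""
    if call.endswith("D"):
        return "East Asian"
    if call.endswith("C"):
        return "Western"
    return "unclassified"


if __name__ == "__main__":
    call = call_epiya_motif(True, True, True, True)
    print(call, cag_a_type(call))
```

Strains whose reactions give "no product" or "inconclusive" in this sketch correspond to the cases that the text resolves by sequencing or excludes from analysis.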

Table 1: Primer Sequences

Primer           Sequence                        Reference
cag2*            GGAACCCTAGTCGGTAATG             (37)
cagA28F          TTCTCAAAGGAGCAATTGGC            (8)
cagA-P1C         GTCCTGCTTTCTTTTTATTAACTTKAGC    (8)
cagA-pA1 (R)     CTTGTCCTGYTTTCTTTTTATTAAC       This study
cagA-P2TA        TTTAGCAACTTGAGTATAAATGGG        (8)
cagA-P2CG        TTTAGCAACTTGAGCGTAAATGGG        (8)
cagA-P3E         ATCAATTGTAGCGTAAATGGG           (8)
cagA-pD (R)      TTGATTTGCCTCATCAAAATC           This study
cagA seq (R)*    TGGTTGAATCCAATTTTATC            This study
grace2           TCATGCGAGCGGCGATGT              This study

*Also used for sequencing

CagA Protein Expression

After 24 hours, lawns of bacteria were harvested from horse blood agar plates,
pelleted, resuspended in 1× phosphate-buffered saline (PBS), and mixed with 5×
Laemmli sample buffer. Bacterial lysates were separated by sodium dodecyl
sulfate-polyacrylamide gel electrophoresis, using a 10% separating gel and a 4% stacking gel,
and proteins were transferred to nitrocellulose membranes using a semidry transfer
apparatus (OWL, Thermo Scientific, Rochester, NY). Membranes were probed with a
1:5,000 dilution of mouse immunoglobulin G1 (IgG1) anti-CagA monoclonal antibody
(Austral Biologicals, San Ramon, CA) followed by a 1:20,000 dilution of horseradish
peroxidase (HRP)-conjugated goat anti-mouse IgG secondary antibody (Santa Cruz
Biotechnology, Santa Cruz, CA). Alternatively, membranes were probed with a 1:5,000
dilution of rabbit IgG anti-CagA polyclonal antibody, B-300 (Santa Cruz Biotechnology),
followed by a 1:20,000 dilution of HRP-conjugated bovine anti-rabbit IgG secondary
antibody (Santa Cruz Biotechnology). Proteins were detected using the Pierce ECL
Western Blotting Substrate kit (Thermo Scientific/Pierce, Rockford, IL) and
photographic film with a Series XXXV A Rapid Processor (S&W Imaging, Frederick,
MD) or using the Super Signal West Pico Chemiluminescent Substrate kit (Thermo
Scientific/Pierce) and a LAS-3000 Intelligent Dark Box with LAS-3000 Lite capture
software (FujiFilm, Stamford, CT).

CagA  Phosphorylation  Assays 

The phosphorylation assays were conducted essentially as previously described
(14). Briefly, six-well tissue culture plates were seeded with 3.5 × 10^3 AGS cells per
well and allowed to grow for three days in normal cell culture media, Dulbecco’s
Modified Eagle’s Media without L-Glutamine (Quality Biological, Inc., Gaithersburg,
MD) supplemented with 10% fetal bovine serum, 10 µg/ml vancomycin, and 2 mM
L-glutamine (Quality Biological, Inc.). Two hours prior to infection, AGS cells were
washed with 1× PBS, and 3 ml of fresh media were added to each well. Liquid cultures
of H. pylori were resuspended in 1 ml of 1× PBS and used to infect the AGS cells at an
MOI of 100. Infections were allowed to proceed for 5 hours, at which point the media
was removed, and the cells were washed with 1× PBS and lysed with 5× Laemmli
sample buffer. Infected cell lysates were separated by sodium dodecyl
sulfate-polyacrylamide gel electrophoresis, using a 6% separating gel and a 4% stacking gel.
Proteins were transferred to nitrocellulose membranes by semidry transfer. Membranes
were then probed with a 1:5,000 dilution of an anti-phospho-tyrosine monoclonal
antibody, pY100 (Cell Signaling Technology, Danvers, MA), followed by a 1:20,000
dilution of HRP-conjugated goat anti-mouse IgG secondary antibody and detection as
described above. Membranes were subsequently stripped (using a heated 10-mM
dithiothreitol solution) and re-probed with polyclonal anti-CagA antibody, B-300, as
described above. Densitometry was performed using MultiGauge software (FujiFilm,
Stamford, CT).

A low level of cross-reactivity for CagA with the anti-phospho-tyrosine
antibodies (pY100 and pY99; BD Biosciences, San Jose, CA) was observed. Hence,
bacterial lysates were run adjacent to their corresponding infected cell lysates so that any
cross-reactivity could be accounted for when comparing the ratio of phosphorylated
CagA to total CagA of infected cell lysates (Table 3).
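
The text does not specify how the cross-reactivity was accounted for numerically. One possible approach, shown here purely as a hedged sketch, is to subtract the phosphotyrosine-to-CagA signal ratio of the bacterial lysate (BL) from that of the infected cell lysate (ICL). The function name, the subtraction scheme, and the example band intensities are all assumptions for illustration, not values or methods from the source.

```python
def corrected_phospho_ratio(icl_ptyr: float, icl_caga: float,
                            bl_ptyr: float, bl_caga: float) -> float:
    """Phospho-CagA / total CagA for an infected cell lysate (ICL),
    minus the apparent ratio caused by antibody cross-reactivity in
    the matching bacterial lysate (BL). Inputs are densitometry
    band intensities in arbitrary units."""
    if icl_caga <= 0 or bl_caga <= 0:
        raise ValueError("total CagA signal must be positive")
    cross_reactivity = bl_ptyr / bl_caga   # signal without host kinases
    raw_ratio = icl_ptyr / icl_caga        # signal after infection
    return max(raw_ratio - cross_reactivity, 0.0)


if __name__ == "__main__":
    # Hypothetical intensities, for illustration only.
    print(corrected_phospho_ratio(750.0, 1000.0, 250.0, 1000.0))
```

Normalizing each lane to its own total CagA signal before subtracting keeps differences in loading between the two lysates from distorting the comparison.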




Cell  Elongation  Studies 

The  cell  elongation  studies  were  essentially  conducted  as  previously  described 
(14).  Six-well  tissue  culture  plates  were  seeded  with  2.7  X  ICh  AGS  cells  per  well  and 
allowed  to  grow  in  normal  cell  culture  media  for  22  hours.  After  22  hours  and 
approximately  two  hours  before  infection,  the  media  was  removed,  the  cells  were  washed 
with  1  X  PBS,  and  1  ml  of  fresh  cell  culture  media  supplemented  with  10%  BB  was 
added  to  each  well.  Eighteen-hour  liquid  cultures  of  H.  pylori  that  were  suspended  in  1 
ml  of  the  BB  supplemented  cell  culture  media  were  then  used  to  infect  at  an  MOI  of  100. 

Infections were allowed to proceed for nine hours, at which point cells were fixed
with 2% paraformaldehyde in 100 mM phosphate buffer (pH 7.4) and stained with
Giemsa (Sigma-Aldrich, Inc, St. Louis, MO) per the manufacturer’s directions. Cells were
analyzed  using  an  Olympus  BX60  (Olympus  America  Inc,  Center  Valley,  PA)  and  were 
digitally  photographed  using  a  Spot  RT  color  camera  (Diagnostic  Instruments,  Sterling 
Heights,  MI).  One  hundred  cells  were  counted  to  assess  the  number  of  cells  displaying 
the  “hummingbird”  phenotype,  which  is  characterized  by  the  presence  of  finger-like 
protrusions  (40).  In  each  case,  the  infections  and  analysis  were  replicated  to  verify  the 
reproducibility  of  the  results.  Any  strain  that  on  average  induced  greater  than  60%  of  the 
AGS  cells  to  display  the  “hummingbird”  phenotype  in  biologically  independent 
experiments  was  considered  positive  for  the  presence  of  a  functional  CagA. 
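
The scoring rule above (an average of more than 60% of counted cells showing the “hummingbird” phenotype across independent experiments) can be written out as a short Python sketch. The function names are hypothetical and the replicate values in the usage example are invented for illustration; only the 60% threshold and the 100-cell count come from the text.

```python
from statistics import mean

HUMMINGBIRD_THRESHOLD = 60.0  # percent of cells, as stated in the text


def percent_elongated(elongated: int, counted: int = 100) -> float:
    """Percentage of counted cells that show the elongated phenotype."""
    if counted <= 0 or not 0 <= elongated <= counted:
        raise ValueError("need 0 <= elongated <= counted and counted > 0")
    return 100.0 * elongated / counted


def has_functional_caga(replicate_percents) -> bool:
    """True when the replicate average exceeds the 60% threshold."""
    return mean(replicate_percents) > HUMMINGBIRD_THRESHOLD


if __name__ == "__main__":
    # Hypothetical counts from two independent infections.
    replicates = [percent_elongated(78), percent_elongated(80)]
    print(has_functional_caga(replicates))
```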

IL-8  Induction  Assay 

Six-well tissue culture plates were seeded with 4.2 × 10^5 AGS cells per well and
allowed to grow in normal cell culture media for 24 hours. At this point the media was
removed and replaced with media lacking serum for a period of 24 hours before H. pylori
infection. Approximately two hours prior to infection, the cells were washed with 1×
PBS, and 1 ml of fresh media without serum was added to each well. Eighteen-hour
liquid cultures of H. pylori were pelleted and resuspended in 700 µl of cell culture media
without serum and used to infect the semi-confluent AGS cells at an MOI of 100. After
five hours, the cell culture supernatant was collected, samples were centrifuged at 16,100
relative centrifugal force for 10 minutes, and the supernatant was transferred to a new
tube and stored at -20°C until later use. Human interleukin-8 (IL-8) concentration was
measured using an enzyme-linked immunosorbent assay kit (R&D Systems, Minneapolis,
MN) following the manufacturer’s directions. The change in IL-8 concentration was
calculated in comparison to G27-MA ΔPAI. An independent biological repeat of each
infection was conducted, and strains were considered to induce IL-8 if the average fold
change was more than 10-fold.
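
As a minimal sketch of the IL-8 scoring rule, the Python below computes fold change relative to the G27-MA ΔPAI reference and applies the 10-fold cutoff to the replicate average. Treating fold change as a simple concentration ratio, pairing each sample with the reference from the same experiment, the function names, and the example concentrations are all assumptions for illustration.

```python
from statistics import mean

IL8_FOLD_THRESHOLD = 10.0  # cutoff stated in the text


def fold_change(sample_conc: float, reference_conc: float) -> float:
    """IL-8 concentration relative to the G27-MA deltaPAI reference."""
    if reference_conc <= 0:
        raise ValueError("reference concentration must be positive")
    return sample_conc / reference_conc


def induces_il8(sample_concs, reference_concs) -> bool:
    """True when the mean fold change over replicates exceeds 10."""
    folds = [fold_change(s, r) for s, r in zip(sample_concs, reference_concs)]
    return mean(folds) > IL8_FOLD_THRESHOLD


if __name__ == "__main__":
    # Hypothetical ELISA readings (same units for sample and reference).
    print(induces_il8([1200.0, 1500.0], [100.0, 100.0]))
```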

Statistical  Analysis 

The Fisher exact test was used to analyze the association between disease state
and EPIYA motif genotype. Log linear modeling was used to assess whether this
association was consistent across the age and sex subgroups. We fit a saturated model
using categorical variables representing genotype, disease state, sex, and age groups. A
backwards selection algorithm identified two- and three-way associations among these
variables that were statistically significant at the 5% level. Data were analyzed using
SPSS version 14 software (SPSS Inc., Chicago, IL).

Nucleotide Sequence Accession Numbers

The sequences of the C-terminal region of CagA from 47 strains have been
deposited in GenBank, accession numbers FJ458117-FJ458163.

Results 

Sample  Acquisition/cagA  Genotyping 

A  total  of  260  H.  pylori  clinical  isolates  were  obtained  from  patients  presenting 
with  gastric  maladies  (Table  2).  Six  of  these  were  missing  the  epidemiological  data  of 
age and gender. Of the remaining 254, the mean patient age was 51 years, with an age
range of 14 to 86 years. There were 126 females (49.6%), with a mean age of 52 years
and an age range of 21 to 86 years, and 128 males (50.4%), with a mean age of 50 years
and an age range of 14 to 82 years. Of the 254 samples, 45.3% were from patients with
gastritis, 43% with ulcers (21.7% gastric ulcers and 21.3% duodenal ulcers), and 11.8%
with cancer.

Four different PCR reactions were conducted for each strain in order to genotype
cagA (Fig. 4A and 4B; see Fig. S1 in the supplemental material). As previously
described, three of these PCR reactions identify different EPIYA motifs: one identifies
the EPIYA-A motif, one identifies the EPIYA-B motif, and one identifies either an
EPIYA-C or an EPIYA-D motif. We also designed and employed primer cagA-pD (R),
which is well conserved among strains carrying the EPIYA-D motif and is one of
the first primers designed to specifically amplify the EPIYA-D motif. Using this
technique, we were able to genotype 234 strains (Table 2). These strains displayed the
same age range as the full collection and a mean age of 50 years. Again the proportion of
females (112) to males (116) was virtually identical to that of the larger collection, 49.1% to
50.9%, respectively. The remaining 26 strains, which all came from non-cancer patients,
failed to yield PCR products or gave incorrectly sized bands, and thus were not further
analyzed.

Table 2: Epidemiological Breakdown of the Korean Collection

                      Total                   Disease State
                      Samples    Genotyped    Gastritis    Gastric Ulcer  Duodenal Ulcer  Gastric Cancer
Overall Total         260*       234* (90%)   108** (46%)  42 (18%)       54*** (23%)     30 (13%)
  Age Range           14-86      14-86        19-82        34-84          14-72           37-86
  Mean                51         50           49           55             45              58
Females               126 (50%)  112 (49%)    69 (64%)     10 (24%)       19 (39%)        14 (47%)
  Age Range           21-86      21-86        21-82        46-84          31-72           37-86
  Mean                52         52           49           57             51              61
Males                 128 (50%)  116 (51%)    38 (36%)     32 (76%)       30 (61%)        16 (53%)
  Age Range           14-82      14-82        19-78        34-82          14-70           38-70
  Mean                50         49           48           54             41              55

-ABD cagA                        200* (85%)   87** (81%)   38 (90%)       45*** (83%)     30 (100%)
  Age Range                      14-86        19-78        34-84          14-72           37-86
  Mean                           51           49           55             44              58
Females                          90 (46%)     57 (66%)     8 (21%)        11 (28%)        14 (47%)
  Age Range                      21-86        21-75        46-84          36-72           37-86
  Mean                           52           49           58             52              61
Males                            104 (54%)    29 (34%)     30 (79%)       29 (73%)        16 (53%)
  Age Range                      14-82        19-78        34-82          14-70           38-70
  Mean                           49           48           53             42              55

All Other genotypes              34 (15%)     21 (19%)     4 (10%)        9 (17%)         0 (0%)
  Age Range                      28-82        28-82        48-81          31-72           0
  Mean                           50           49           61             49              0
Females                          22 (65%)     12 (57%)     2 (50%)        8 (89%)         0
  Age Range                      28-82        28-82        48-56          31-72           0
  Mean                           51           50           52             51              0
Males                            12           9 (43%)      2 (50%)        1 (11%)         0
  Age Range                      33-81        36-61        58-81          33              0
  Mean                           49           46           70             N/A             0

*6 without age or sex information
**One without age
***5 without age

To confirm the cagA genotyping results, we sequenced the C-terminal region of
the cagA gene from 47 of the 234 genotyped strains. These sequences verified that the
PCR genotyping method was accurate. Alignments of the predicted amino acid
sequences of those strains carrying the -ABD motif can be found in Fig. 4C.

Of the 234 genotyped strains, 208 isolates (88.9%) carried an EPIYA-D motif, therefore
classifying them as East Asian, and 26 isolates (11.1%) were determined to carry
Western CagA (see Table S1 in the supplemental material). Among the East Asian
strains, eight carried an incomplete -ABD motif or contained additions of one or more
motifs (EPIYA-AABD, -BD, -BBD, -ABAB*D, and -AB*D, with a mutation within the
EPIYA-B motif designated by the asterisk), and thus we subdivided the strains based
on the presence of CagA containing a complete EPIYA-ABD motif versus all other
EPIYA motifs. Given these characteristics, 34 individuals were determined to have
“other genotypes,” which include alternative -ABD as well as Western motifs. We next
analyzed the distribution of cagA genotypes among disease states. There were 108
gastritis patients, 54 duodenal ulcer patients, 42 gastric ulcer patients, and 30 gastric
cancer patients. Strains encoding EPIYA-ABD CagA were isolated from 80.6% of gastritis
patients, 83.3% of duodenal ulcer patients, 90.5% of gastric ulcer patients, and 100% of
gastric cancer patients. Stratification of the patients based on age, sex and disease
categories can be found in Table 2, and a schematic depiction of the distribution of the
cagA genotypes stratified by disease state within this Korean population can be found in Fig. 5.
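
The counts and percentages quoted in this paragraph can be cross-checked with a few lines of arithmetic. The Python sketch below uses only numbers stated in the text and in Table 2; the variable names are chosen here for illustration.

```python
# Genotyped strains per disease state, and those with complete EPIYA-ABD
# (counts taken from the text and Table 2).
genotyped = {"gastritis": 108, "duodenal ulcer": 54,
             "gastric ulcer": 42, "gastric cancer": 30}
abd = {"gastritis": 87, "duodenal ulcer": 45,
       "gastric ulcer": 38, "gastric cancer": 30}

total = sum(genotyped.values())          # all genotyped strains
east_asian, western, variant_east_asian = 208, 26, 8

complete_abd = east_asian - variant_east_asian   # complete EPIYA-ABD
other_genotypes = western + variant_east_asian   # "other genotypes"

# Percentage of EPIYA-ABD strains within each disease state.
pct_abd = {state: round(100 * abd[state] / n, 1)
           for state, n in genotyped.items()}

if __name__ == "__main__":
    print(total, complete_abd, other_genotypes)
    print(round(100 * east_asian / total, 1), round(100 * western / total, 1))
    print(pct_abd)
```

The derived values agree with the figures reported above: 234 genotyped strains, 200 complete EPIYA-ABD strains, and 34 strains with other genotypes.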

Figure 4: Genotyping of the cagA variable-EPIYA motif. A. Schematic representation of
the cagA variable region. Western CagA (EPIYA-ABC) is depicted on the top, and East
Asian CagA (EPIYA-ABD) is depicted on the bottom. The annealing positions (small
arrows), names of the primers used in this study, and the expected sizes (large arrows) of
the amplified specific EPIYA motif products as based on strain K3-CA DNA sequence
are shown. Primer names are abbreviated as follows: cagA28F is A28F, cag2 is 2,
cagA-P1C is P1C, cagA-pA1 (R) is A, cagA-P2TA is P2TA, cagA-P2CG is P2CG, cagA-P3E
is P3E, and cagA-pD (R) is D. B. PCR amplicons of K49-DU and K3-CA using the
forward primer 2 and the reverse primer A, P2TA and P2CG (equimolar mixture), P3E,
or D. “M” designates the size markers (in base pairs). Type indicates the resulting EPIYA
motif identified. C. Amino acid alignment of the carboxy terminus of CagA from 29
Korean strains encoding the EPIYA-ABD motif. The EPIYA-A, -B and -D motifs are
indicated below the consensus sequence.

Even though this collection was evenly distributed for the factors of age and
gender, several trends for each were observed based on statistical analysis. Not
surprisingly, age is statistically linked to disease state (P<0.001). Moreover, as has been
suggested in other studies (38), males were more likely to have ulcers (odds ratio, 3.89;
confidence interval, .81 to 8.36), whereas females were more likely to have gastritis
(odds ratio, 1.91; confidence interval, 1.91 to 5.67). Conversely, the cancer patients were
evenly distributed by gender, with 46.7% being female and 53.3% being male. This
differs from what is most often seen in the literature, which shows that men are anywhere
from 1.5 to 2.5 times more likely to be afflicted with gastric cancer than women (as
reviewed in 35). A significant three-way association was also observed using log linear
modeling between gender, disease, and cagA allele (P=0.009).

Given that 100% of the gastric cancer patients were infected with H.
pylori encoding CagA with the EPIYA-ABD motif, we conducted statistical analysis to
assess the relationship between disease state and genotype. The Fisher’s exact test
showed that the proportion of patients with the EPIYA-ABD genotype varied
significantly (P=0.022) by diagnosis (Fig. 5). In fact, the proportion of cancer patients
with the EPIYA-ABD genotype (100%) was significantly higher than the proportion in
gastritis patients (80.6%; P=0.004) or duodenal ulcer patients (83.3%; P=0.014), but not
in gastric ulcer patients (90.5%; P=0.109; Fig. 5). Taken together, these data suggest
that there is a definitive link between infection with H. pylori strains carrying cagA
which encodes the EPIYA-ABD motif and the development of gastric cancer.
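
For readers who want to see how such pairwise comparisons can be computed, the sketch below evaluates a Fisher exact P value directly from the hypergeometric distribution, using the genotype counts in Table 2. It is not the SPSS procedure used in the study, and the text does not say whether the tests were one- or two-sided; the one-sided form is an assumption made here. Under that assumption, the three results round to the pairwise values quoted above (0.004, 0.014, and 0.109).

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact P value for the 2x2 table

                    EPIYA-ABD   other
        group 1         a         b
        group 2         c         d

    i.e. the probability, with all margins fixed, of seeing at least
    `a` EPIYA-ABD strains in group 1."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(comb(col1, k) * comb(n - col1, row1 - k)
                     for k in range(a, upper + 1))
    return favourable / comb(n, row1)


if __name__ == "__main__":
    cancer = (30, 0)  # EPIYA-ABD, other genotypes (Table 2)
    others = {"gastritis": (87, 21),
              "duodenal ulcer": (45, 9),
              "gastric ulcer": (38, 4)}
    for name, (abd, other) in others.items():
        p = fisher_one_sided(cancer[0], cancer[1], abd, other)
        print(f"cancer vs {name}: P = {p:.3f}")
```

Because none of the cancer isolates fell in the "other genotypes" category, the observed table is already the most extreme one in that direction, so each one-sided P value is simply the probability of the observed table.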

Figure 5: Schematic depiction of the distribution of the cagA genotypes stratified by
disease state within this Korean population. Distribution of cagA genotypes,
EPIYA-ABD versus all other EPIYA motifs, within the four different disease states: gastritis,
gastric ulcers, duodenal ulcers, and cancer. The shaded portion within the other
genotypes subgrouping corresponds to the isolates that contain an alternative
EPIYA-ABD motif. Calculated P values, using the Fisher’s exact test, are shown.


CagA  Protein  Expression 

Given that we saw a significant statistical link between the presence of
the EPIYA-ABD genotype and gastric cancer, but that some strains that carry cagA do not
actually express the CagA protein (25), we next sought to determine if genotypically
cagA+ strains were phenotypically CagA positive. Bacterial lysates from a subset of 77
randomly chosen strains were assessed for expression of CagA. Of the 77 isolates
examined using a monoclonal antibody, four samples (K19-CA, K82-G, K255-G, and
K264-DU) showed no appreciable CagA expression (Fig. 6A and data not shown). Given
that CagA shows heterogeneity in the carboxy terminus that may affect protein
structure and monoclonal antibody recognition, we also utilized a polyclonal anti-CagA
antibody to confirm that CagA was truly not expressed in these strains. As shown in
Figure 6A, the polyclonal antibody was better able to detect CagA in the majority of
strains. This included K19-CA, for which CagA was not detected with the monoclonal
antibody. Using this assay, three of the 77 strains (K82-G, K255-G, and K264-DU)
expressed no detectable level of CagA (Table 3 and data not shown).

Delivery  and  Phosphorylation  of  CagA 

Once  CagA  is  expressed,  it  must  be  delivered  to  host  cells  via  the  type  IV 
secretion  apparatus  and  phosphorylated  by  host  cell  kinases  to  be  biologically  active  (40). 
Therefore,  we  next  conducted  phosphorylation  assays  to  determine  if  CagA  could  be 
delivered  to  and  phosphorylated  in  host  cells.  Of  the  77  strains  tested,  59  of  the  isolates 
efficiently  delivered  CagA  to  the  host  cells  as  detected  by  the  appearance  of  a  strongly 
phosphorylated CagA band (Fig. 6B and Table 3). Of the remaining 18 isolates, an
intermediate level of phosphorylated CagA was detected for 13 strains, and no detectable
phosphorylated CagA was found for five strains (K82-G, K111-DU, K123-G, K255-G,
and K264-DU; Table 3). Importantly, the three strains shown to be negative for CagA
expression were included among these five, and K19-CA, which was only detected with
the polyclonal CagA antibody, was positive for phosphorylation.

Figure 6: Expression, Delivery, and Phosphorylation of CagA. A. Western blot analysis
of bacterial lysates from the six indicated Korean strains was conducted using a
monoclonal anti-CagA antibody (top), or the polyclonal anti-CagA antibody, B-300
(bottom). B. Lysates from the bacterial cells alone (BL) and AGS cells infected with the
same bacterial strain (ICL) were assessed for delivery and phosphorylation of CagA.
Membranes were probed with an anti-phosphotyrosine antibody, pY100 (top), stripped,
and subsequently reprobed with polyclonal anti-CagA antibody, B-300 (bottom). Data are
shown from the positive control, G27-MA, and three indicated Korean isolates.

Cell  Elongation  Assay 

Upon injection into host cells, CagA becomes phosphorylated and causes
striking host cell elongation, which is known as the “hummingbird” phenotype (40).
Thus, to reassess the presence of functional CagA in the 18 isolates that produced either
an intermediate phenotype or no detectable level of phosphorylated CagA, the ability to
induce the “hummingbird” phenotype was assessed in cultured AGS cells. A wild-type
strain, G27-MA, and its isogenic ΔcagA mutant were used as positive and negative
controls, respectively. The percentage of cells displaying the “hummingbird” phenotype
was 23% for the G27-MA ΔcagA-infected cells and 80% for the G27-MA-infected cells
(Fig. 7A and B). The range for the 18 Korean isolates was between 31.5% (K82-G)
and 79% (K25-DU; Fig. 7C and D; Table 3). Given the large range of changes, we
conservatively required that a strain induce at least 60% of AGS cells to display the
“hummingbird” phenotype to be considered positive for delivery of functional CagA.
Four of the 18 samples tested did not meet this threshold (K82-G, K123-G, K255-G, and
K264-DU). These samples also showed no detectable level of phosphorylated CagA via
the phosphorylation assay.

Table 3: Analysis of CagA Expression and Function

Strain     EPIYA   CagA        Phosphorylation  Induction of "Hummingbird"  IL-8
           Motif   Expression  of CagA(a)       Phenotype(b)                Induction(c)
K82-G      ABD     -           -                -                           -
K255-G     ABD     -           -                -                           -
K264-DU    ABD     -           -                -                           +/-
K111-DU    ABD     +           -                +
K123-G     ABD     +           -                -                           -
K17-CA     ABD     +           +/-              +                           +
K26-DU     ABD     +           +/-              +                           +
K208-G     ABD     +           +/-              +                           +
K21-CA     ABD     +           +/-              +
K23-DU     ABD     +           +/-              +
K25-DU     ABD     +           +/-              +
K42-DU     ABD     +           +/-              +
K104-CA    ABD     +           +/-              +
K182-DU    ABD     +           +/-              +
K193-G     ABD     +           +/-              +
K238-DU    ABD     +           +/-              +
K248-G     ABD     +           +/-              +
K259-CA    ABD     +           +/-              +
K3-CA      ABD     +           +
K6-CA      ABD     +           +
K10-CA     ABD     +           +
K16-CA     ABD     +           +
K19-CA     ABD     +           +
K28-DU     ABD     +           +
K34-DU     ABD     +           +
K35-DU     ABD     +           +
K36-DU     ABD     +           +
K37-DU     ABD     +           +
K41-DU     ABD     +           +
K43-DU     ABD     +           +
K44-DU     ABD     +           +
K45-DU     ABD     +           +
K46-DU     ABD     +           +
K47-DU     ABD     +           +
K48-DU     ABD     +           +
K57-G      ABD     +           +
K60-G      ABC     +           +
K64-G      ABCC    +           +
K74-G      ABD     +           +
K77-G      ABD     +           +
K78-G      AABD    +           +
K80-CA     ABD     +           +
K85-G      BD      +           +
K93-DU     ABC     +           +
K107-DU    ABD     +           +
K109-G     ABD     +           +
K112-G     ABD     +           +
K113-G     ABD     +           +
K115-G     ABC     +           +
K117-G     ABD     +           +
K131-G     ABD     +           +
K162-G     ABD     +           +
K172-G     ABCC    +           +
K175-G     ABD     +           +
K178-G     ABD     +           +
K183-G     ABD     +           +
K185-G     ABD     +           +
K196-G     ABD     +           +
K197-G     ABD     +           +
K209-G     ABD     +           +
K218-G     ABD     +           +
K220-DU    ABD     +           +
K223-G     ABD     +           +
K235-G     ABD     +           +
K241-G     ABD     +           +
K258-CA    ABD     +           +
K260-CA    ABD     +           +
K261-CA

(a) Phosphorylation of CagA: ++ near or above G27-MA (positive control), + slightly below the level of G27-MA, +/- below the level
of G27-MA but above background, - not detectable.

(b) Induction of Hummingbird Phenotype: + greater than 60% of cells displayed the hummingbird phenotype, - less than 60% of cells
displayed the hummingbird phenotype.

(c) Induction of IL-8: + induction greater than 10-fold increase over G27-MA ΔPAI, - induction less than 10-fold increase over
G27-MA ΔPAI.

* -ABABD: in the second -B motif, proline is replaced with leucine (ELIYA)

** -B motif: proline is replaced with serine (ESIYA)

Induction of IL-8

Since several isolates were identified that did not express any detectable level of
functional CagA as measured by the phosphorylation assay and induction of the
“hummingbird” phenotype, we finally assessed assembly of the type IV secretion system
on the bacterial surface in this subset of strains. Proper assembly of the type IV secretion
system has been shown to result in the induction of IL-8 in cultured AGS cells (11).
Therefore, we assessed IL-8 induction with the four strains that failed to produce
phosphorylated CagA and failed to induce the “hummingbird” phenotype. Additionally,
as a positive control, we analyzed several strains that did induce the “hummingbird”
phenotype. One (K264-DU) of the four “hummingbird” phenotype-negative samples
induced IL-8 (at least 10-fold above the level induced by G27-MA ΔPAI), indicating that,
for this strain, the failure to detect CagA and phosphorylated CagA or to induce the
“hummingbird” phenotype was not due to the lack of a functional type IV secretion
system (Fig. 7E and Table 3).


Discussion 

Herein  we  show  a  significant  statistical  link  between  the  presence  of  the  CagA 
EPIYA-ABD  motif  and  development  of  gastric  cancer.  In  fact,  100%  of  gastric  cancer 
patients  analyzed  in  this  South  Korean  population  were  infected  with  H.  pylori  strains 
encoding  CagA  containing  the  EPIYA-ABD  motif.  Statistical  analysis  with  the  Fisher 
exact  test  showed  that  the  proportion  of  EPIYA-ABD  genotype  varied  significantly  by 
diagnosis  (P=0.022),  and  that  this  distribution  was  statistically  different  than  that  of 
gastritis patients (P=0.004) or duodenal ulcer patients (P=0.014; Fig. 5).
These data suggest that the distribution of alleles is not random and is important in the
case of gastric cancer. While on the whole the presence of the cagA gene did not strictly
correlate with the expression and delivery of CagA to host cells, all of the analyzed cancer
strains did express a functional CagA that could be delivered to and phosphorylated in
host cells. This is the first time that a specific cagA allele has been statistically linked to
gastric cancer. However, it should be noted that the EPIYA-ABD allele is not
necessarily a predictor of cancer, since there was a high percentage of peptic ulcer
patients, both gastric (90%) and duodenal (83%), infected with isolates containing
EPIYA-ABD CagA. Alternatively, these data could indicate that patients infected with
H. pylori containing non-EPIYA-ABD motifs are less likely to develop cancer.

Figure 7: Morphological Changes and Induction of IL-8. Induction of morphological
changes in AGS cells when measured after 9 hours of infection with the following
strains: G27-MA (positive control) (A), G27-MA ΔcagA (negative control) (B), K25-DU
(C), and K82-G (D). E. Induction of IL-8 from the indicated strains expressed as
increase above the induction elicited by G27-MA ΔPAI (negative control) after a
five-hour infection.

This  association  between  the  East  Asian  cagA  genotype  and  gastric  cancer  may 
be  due  to  higher  affinity  for  SHP-2  (25).  Binding  of  CagA  with  SHP-2  occurs  via 
interaction  of  the  phosphorylated  EPIYA  motifs  and  the  SH2  domains  from  the  host  cell 
protein.  This  interaction  changes  the  conformation  of  SHP-2  to  its  active  form.  Thus,  the 
stronger  affinity  of  East  Asian  CagA  for  SHP-2  results  in  longer  periods  of  SHP-2 
activity. This likely explains why East Asian strains cause greater morphological damage
and a greater level of induction of multiple cellular pathways, resulting in more
proliferation, morphogenesis, and cell motility than Western strains (23-25, 29, 36, 46).

While we did isolate eight strains carrying EPIYA-AABD, -BD, -BBD, -ABAB*D, or -AB*D
motifs,  the  vast  majority  of  East  Asian  strains  we  examined  showed  strong  conservation, 
and  lack  of  duplication  in  the  EPIYA-D  region.  This  suggests  that  variation  in  the  East 
Asian  cagA  is  not  as  favorable  as  in  Western  isolates  where  the  EPIYA-C  motif  is  found 
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to  vary  widely  among  isolates  (5,  7).  Similar  results  have  been  seen  by  Argent  et  al.,  who 
showed  by  sequence  analysis  of  500  East  Asian  strains  available  in  GenBank  that  the 
percentage  of  other  East  Asian  alleles  compared  to  those  coding  for  EPIYA-ABD  CagA 
was fairly small: 88.3% of East Asian strains contained an EPIYA-ABD CagA (5). Also,
it  is  interesting  that  all  of  the  cancer  strains  were  specifically  EPIYA-ABD,  which 
suggests  that  among  East  Asian  isolates  this  combination  of  EPIYA  motifs  is  most 
favored  for  cancer  development.  It  should  be  noted  that  in  a  study  of  Japanese  cancer 
patients,  all  H.  pylori  isolates  contained  the  EPIYA-D  motif,  and  the  majority  of  those 
isolates (84%) contained an EPIYA-ABD motif (10). The other 16% were
made up of isolates carrying the EPIYA-AABD, -ABBD, -ABABD, and -ABDBD
motifs (10).
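
The motif naming used above can be made concrete with a minimal sketch. It assumes the working definitions used in this work: a strain is East Asian if its CagA encodes an EPIYA-D motif and Western if it encodes at least one EPIYA-C motif. The function name `classify_epiya` and the dictionary keys are hypothetical and were chosen only for illustration.

```python
def classify_epiya(motif: str) -> dict:
    """Classify a CagA EPIYA motif string such as "ABD", "AB*D", or "ABCCC".

    An asterisk marks a mutated segment (e.g. "AB*D") and is ignored when
    deciding the lineage, but it prevents the motif from counting as the
    standard EPIYA-ABD arrangement.
    """
    segments = motif.upper().replace("*", "").replace("-", "")
    if "D" in segments:
        lineage = "East Asian"      # encodes an EPIYA-D motif
    elif "C" in segments:
        lineage = "Western"         # encodes at least one EPIYA-C motif
    else:
        lineage = "unclassified"    # neither C nor D segment present
    return {
        "motif": motif,
        "lineage": lineage,
        "standard_abd": motif.upper() == "ABD",
    }


if __name__ == "__main__":
    for m in ("ABD", "AABD", "BD", "BBD", "ABAB*D", "AB*D", "ABCCC"):
        print(classify_epiya(m))
```

Under these assumptions, only the exact string "ABD" is flagged as the standard motif, while variants such as -AABD or -AB*D remain East Asian but non-standard.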

The  reason  for  the  EPIYA-ABD  conservation  is  unknown,  but  perhaps  a  single 
EPIYA-D  motif  allows  for  optimal  SHP-2  binding.  The  presence  of  extra  motifs  may 
contort CagA’s conformation and destabilize binding to SHP-2. Additionally, it is known
that  phosphorylated  CagA  at  EPIYA-A  and  -B  motifs  binds  to  Csk  and  activates  a 
negative  feedback  loop  that  inactivates  the  Src  family  kinases,  and  ultimately  reduces  the 
level  of  phosphorylated  CagA  in  the  cell  (45).  Thus,  it  is  reasonable  to  suggest  that  the 
presence  of  additional  EPIYA-A  or  -B  motifs  in  association  with  EPIYA-D  motif  would 
more  strongly  activate  this  negative  feedback  loop  (5).  In  support  of  the  importance  of 
conservation of the EPIYA-ABD motif in the disease state, seven out of the eight isolates
containing an EPIYA-D motif but not a complete standard EPIYA-ABD motif caused only
gastritis.
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Multiple  host,  dietary,  environmental,  and  bacterial  virulence  factors  have  been 
shown to play a role in H. pylori-induced disease. In this molecular epidemiologic study,
we  have  shown  a  definitive  statistical  difference  in  the  distribution  of  the  cagA  allele 
coding  for  the  EPIYA-ABD  motif  in  cancer  versus  other  disease  states;  100%  of  the 
cancer  patients  were  infected  with  H.  pylori  strains  carrying  the  EPIYA-ABD  genotype. 
Currently  the  reason  for  this  correlation  is  unclear,  and  further  study  is  required  to 
elucidate the molecular role that the EPIYA motif plays in cancer development.

Figure S1: Genotyping of the complex cagA variable-EPIYA motifs. A. Schematic
representation  of  the  variable  region  of  K30-DU  CagA  and  K263-G  CagA  is  depicted  on 
the  left  and  right,  respectively.  The  annealing  positions  (small  arrows)  and  names  of  the 
primers  used  in  this  study  and  the  expected  sizes  of  the  amplified  specific  EPIYA  motif 
products  according  to  the  DNA  sequence  are  shown.  B.  PCR  amplicons  of  K30-DU  and 
K263-G  are  shown  using  the  forward  primer  2  and  the  reverse  primers  A,  P2TA  and 
P2CG  (equimolar  mixture),  P3E,  or  D.  M  designates  the  size  markers  (in  base  pairs). 
Type indicates resulting EPIYA motif identified.

[Figure S1 image: panel A shows the variable regions of K30-DU CagA (EPIYA motifs A B C C C) and K263-G CagA (EPIYA motifs A B A B D) with primer positions; panel B shows the PCR amplicons beside size markers of 2000, 1500, 1000, 750, 500, and 250 bp.]

Chapter  Three 


An Epidemiological Link Between Gastric Disease and Polymorphisms in VacA and CagA

The work presented in this chapter is the sole work of K. R. Jones with the
I.S. Chung performed the biopsies and supplied the diagnoses, and C.H. Olsen assisted
with the statistical analysis.


Abstract 

Gastritis, peptic ulcer disease, and gastric cancer are a few of the diverse disease
manifestations that have been shown to be associated with infection by Helicobacter
pylori. Why some individuals develop more severe forms of disease remains largely
unknown. In this study, 225 South Korean strains were genotyped for vacA and then
analyzed to determine if particular genotypes varied across disease state, sex, or cagA
allele. Of these strains, 206 carried an s1/i1/m1 allele, 11 carried an
s1/i1/m2 allele, and 8 carried an s1/i2/m2 allele. By using Fisher’s exact test, a
statistical association between variations in the cagA and vacA alleles was identified
(P=0.0007), and by using log linear modeling, this variation was shown to affect the severity
of disease outcome (P=0.027). Additionally, we present evidence that variation within
the middle region of VacA contributes significantly to the distribution of vacA alleles
across gender (P=0.008) as well as the association with disease outcome (P=0.011). In
this South Korean population, the majority of H. pylori strains carry the vacA s1/i1/m1
allele and the CagA EPIYA-ABD allele. These facts may contribute to the high incidence
of gastric maladies, including gastric cancer.

Introduction 

Helicobacter  pylori  is  a  Gram-negative  bacterium  (27)  that  chronically  infects 
the  gastric  mucosa  of  over  half  of  the  world’s  population  (28,  50)  and  is  associated  with 
the  development  of  chronic  gastritis,  gastric  and  duodenal  ulcers,  and  gastric 
adenocarcinoma and mucosa-associated lymphoid tissue (MALT) lymphoma (6, 10, 14,
40). Given H. pylori’s high prevalence, chronic persistence, and link to gastric cancer,
it is no wonder that gastric cancer is the second most common cause of cancer-associated
death (35), with the mortality rate being especially high in East Asian countries such as
China,  Japan,  and  Korea  (18). 

H.  pylori  strains  express  various  toxins  that  enable  the  bacteria  to  cause  host  cell 
damage  and  interact  with  the  host  immune  response.  Included  among  these  toxins  are  the 
cytotoxin  associated  gene  A  (CagA)  and  the  vacuolating  cytotoxin  (VacA;  33).  CagA 
has  emerged  as  a  major  contributor  to  disease  severity,  and  there  is  a  direct  link  between 
presence  of  CagA  and  increased  cancer  risk  (7,  16).  CagA  induces  various  pathologic 
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changes  by  modulating  host  cell  signaling  pathways,  primarily  after  tyrosine 
phosphorylation  at  the  EPIYA  motif  (17,  19-21,  34,  44,  54).  Interestingly,  CagA  is 
polymorphic,  and  the  distribution  of  EPIYA  motif  combinations  differs  geographically. 

VacA is another important toxin that is produced and secreted by all H. pylori
strains (4, 11), and was previously shown to have various modes of action (12, 15, 26, 33,
38, 48, 49, 52, 56). Like CagA, VacA has been shown to contain a number of
polymorphisms. Currently, three polymorphic regions of vacA have been identified:
signal (s), intermediate (i), and middle (m) regions. Each of these polymorphic regions
has two main types, designated type 1 and type 2 (3, 42). The s region
encodes the N-terminal signal sequence (29, 41), and polymorphisms in the s region
affect anion channel-forming efficiency of the toxin (29); the s1 type has an increased
ability to form membrane channels (29). Polymorphisms in the m region affect the cell
tropism of the toxin (22); the m1 type of VacA shows toxicity to a broader range of cells
than the m2 type (1, 37). The i region, which is located between the s and m regions,
also displays two main polymorphisms (42). The i1 type of VacA has stronger
vacuolating activity than the i2 type (42). Individually, the s1, i1, and m1 types have
been shown to be associated with more severe forms of H. pylori-induced disease (5, 42).

Recently,  we  presented  molecular  epidemiologic  evidence  that  there  is  a 
significant  association  between  the  development  of  gastric  cancer  and  infection  with  H. 
pylori strains carrying the EPIYA-ABD cagA genotype in South Korea, which has one of
the highest rates of H. pylori colonization (51) and one of the highest rates of gastric
cancer in the world (16, 46). Given the mounting body of evidence that indicates that
cagA and vacA interact, herein we assess vacA polymorphisms across various cagA
alleles and in relation to disease development, and we show a significant three-way
association between vacA, cagA, and disease.

Materials and Methods
Bacterial  Strains  and  Culture  Conditions 

The  South  Korean  H.  pylori  clinical  isolates  used  in  this  study  were  previously 
described  in  Jones  et  al.  (23),  and  included  115  gastritis  isolates,  55  gastric  ulcer  isolates, 
54  duodenal  ulcer  isolates,  and  30  gastric  cancer  isolates  with  epidemiological  data  on 
age  and  gender.  H.  pylori  stocks  preserved  at  -80°C  were  grown  and  expanded  on 
antibiotic-supplemented  horse  blood  agar  plates  under  microaerophilic  conditions  created 
by  an  Anoxomat  evacuation/replacement  system  (Spiral  Biotech,  Norwood,  MA),  as 
previously  described  (9,  23). 

vacA  Genotyping 

The primers used for vacA genotyping and sequencing of the i region are listed
in Table 4. Chromosomal DNA of all 254 H. pylori strains was isolated using the
Easy-DNA kit (Invitrogen, Carlsbad, CA). Four individual PCRs were performed to identify
the vacA genotype of each strain (Fig. 8). The s region was identified by amplifying with
primers VA1-F and VA1-R. The s1 region produced a 259 base pair (bp) amplicon,
whereas the s2 region produced a 286 bp amplicon (3). The m1 and m2 regions were
determined by amplifying with primers VAG-F and VAG-R, yielding a 567 bp and a 642
bp product, respectively (4). The i region was genotyped using two independent
PCRs with a universal forward primer (VacF1) and different i region type-specific
reverse primers (C1R and C2R) as described by Rhead et al. (42). C1R and C2R
specifically anneal with the i1 and i2 vacA alleles, respectively (42).

Table 4: Primer sequences

Primer       Sequence (5'-3')           Reference
VA1-F^a      ATGGAAATACAACAAACACAC      (3)
VA1-R^a      CTGCTTGAATGCGCCAAAC        (3)
VAG-F^b      CAATCTGTCCAATCAAGCGAG      (4)
VAG-R^b      GCGTCAAAATAATTCCAAGG       (4)
C1R^c        TTAATTTAACGCTGTTTGAAG      (42)
C2R^c        GATCAACGCTCTGATTTGA        (42)
VacF1^c,d    GTTGGGATTGGGGGAATGCCG      (42)
VacR9^d      TGTTTATCGTGCTGTATGAAGG     (42)

a This primer pair was used to genotype the s region.
b This primer pair was used to genotype the m region.
c These primers were used to genotype the i region.
d These primers were used for the i region sequencing.

Figure 8: Genotyping of the vacA polymorphic regions. (Top) Schematic representation
of the vacA alleles, where an s1/i1/m1 allele is shown on the top and an s2/i2/m2 allele is
depicted on the bottom. The annealing positions (arrows) and names of the primers used
in this study are shown. (Bottom) PCR amplicons of each polymorphic region using the
primers listed above the gel are depicted. The approximate size of the amplicon is listed
below each band.

[Figure 8 image: schematic of the s, i, and m regions with primers VA1-F/VA1-R, VacF1/C1R or C2R, and VAG-F/VAG-R, and a gel with size markers at 750, 500, and 250 bp. Approximate amplicon sizes: s1, 259 bp; s2, 286 bp; i1 and i2, 410 bp; m1, 567 bp; m2, 642 bp.]
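
The genotyping logic described above can be summarized in a short sketch. The expected amplicon sizes are those quoted in the text; the function names and the 10 bp size tolerance are assumptions made only for illustration and are not part of the protocol used here.

```python
# Expected amplicon sizes (bp) quoted in the text for the s and m regions.
EXPECTED_BP = {
    "s": {"s1": 259, "s2": 286},
    "m": {"m1": 567, "m2": 642},
}


def call_type(region, observed_bp, tolerance_bp=10):
    """Return the type whose expected size is closest to the observed band.

    Returns None when the closest expected size differs from the observed
    band by more than tolerance_bp (an assumed, illustrative cutoff).
    """
    best_type, best_diff = None, None
    for type_name, expected in EXPECTED_BP[region].items():
        diff = abs(observed_bp - expected)
        if best_diff is None or diff < best_diff:
            best_type, best_diff = type_name, diff
    return best_type if best_diff <= tolerance_bp else None


def call_i_type(c1r_band, c2r_band):
    """Type the i region by which type-specific reverse primer gives a product."""
    if c1r_band and not c2r_band:
        return "i1"   # product only with VacF1 + C1R
    if c2r_band and not c1r_band:
        return "i2"   # product only with VacF1 + C2R
    return None       # no product, or products with both primers


def vaca_genotype(s_bp, c1r_band, c2r_band, m_bp):
    """Combine the region calls into an s/i/m genotype, or None if any call fails."""
    parts = [call_type("s", s_bp), call_i_type(c1r_band, c2r_band), call_type("m", m_bp)]
    return "/".join(parts) if all(parts) else None


if __name__ == "__main__":
    print(vaca_genotype(259, True, False, 567))
    print(vaca_genotype(260, False, True, 640))
```

Returning None for any ambiguous region mirrors the handling of strains that failed to yield PCR products or gave incorrectly sized bands, which were excluded from further analysis.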

Sequencing  of  60  isolates  from  patients  suffering  from  gastritis  (24  isolates), 
duodenal  ulcers  (10  isolates),  gastric  ulcers  (8  isolates),  and  cancer  (18  isolates)  was 
conducted to identify specific amino acid changes in the i1 allele. Sanger dideoxy
sequencing was performed at the Uniformed Services University of the Health Sciences
Biomedical  Instrumentation  Center  (Bethesda,  MD).  The  resulting  DNA  sequences  were 
analyzed  using  Vector  NTI  version  9.1  (Invitrogen,  Carlsbad,  CA)  and  Sequencher  4.5 
(Gene  Codes  Corp.,  Ann  Arbor,  MI). 

Statistical  Analysis 

Fisher’s exact test was used to analyze the association between the vacA
allele  and  disease  state  or  cagA  allele  (based  on  the  EPIYA  motif).  Log  linear  modeling 
was  used  to  assess  higher  order  associations.  We  fit  a  saturated  model  using  categorical 
variables  representing  vacA  genotype,  cagA  genotype,  disease  state,  gender,  and  age 
categories.  A  backward  selection  algorithm  identified  higher-order  associations  among 
these  variables,  which  were  statistically  significant  at  the  5%  level.  Data  were  analyzed 
using  SPSS  version  14  or  16  software  (SPSS  Inc.,  Chicago,  IL)  or  SAS  version  9.1 
software  (SAS  Institute  Inc.,  Cary,  NC). 
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Nucleotide  Sequence  Accession  Numbers 

The sequences for the i1 region of vacA from 60 strains have been deposited in
GenBank  under  accession  numbers  GQ338184  to  GQ338243  (see  Table  S2  in  the 
supplemental  material). 


Results 

Sample  Acquisition/vacA  Genotyping 

The strains used for this study were previously used for characterization of the
distribution of cagA alleles (23), and represent 260 strains obtained from patients
presenting with gastric maladies; 254 of those strains have complete
epidemiological data (see Table S2 in the supplemental material). The mean patient age
was 51 years, with an age range of 14 to 86 years (Table 5). Within this population
49.6% (126 patients) were female, with an age range of 21 to 86 years and a mean age of
52 years, and 50.4% (128 patients) were male, with a mean age of 50 years and an age
range of 14 to 82 years. Of these samples, 11.8% were from patients with cancer, 42.9%
were from patients with peptic ulcer disease (21.7% gastric ulcers and 21.3% duodenal
ulcers), and 45.3% were from patients with gastritis (23).
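
As a simple arithmetic check, the disease-state percentages above follow directly from the isolate counts given in Materials and Methods (115 gastritis, 55 gastric ulcer, 54 duodenal ulcer, and 30 gastric cancer isolates). The variable names below are illustrative only.

```python
# Isolate counts per diagnosis, as listed in Materials and Methods.
counts = {
    "gastritis": 115,
    "gastric ulcer": 55,
    "duodenal ulcer": 54,
    "gastric cancer": 30,
}

total = sum(counts.values())

# Percentage of the population in each disease state, rounded to one decimal.
percent = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Peptic ulcer disease combines gastric and duodenal ulcers.
peptic_percent = round(100 * (counts["gastric ulcer"] + counts["duodenal ulcer"]) / total, 1)

if __name__ == "__main__":
    print("total isolates:", total)
    for name, value in percent.items():
        print(f"{name}: {value}%")
    print(f"peptic ulcer disease: {peptic_percent}%")
```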

Four different PCRs were conducted for each strain in order to genotype vacA
(Fig. 8 and see Table S2 in the supplemental material). The s region was identified and
differentiated by amplifying with the VA1-F and VA1-R primers (3), and the m region
was determined by amplifying with the primers VAG-F and VAG-R (4). The i region
was genotyped using two independent PCRs with a universal forward primer
(VacF1) and one of two i region type-specific reverse primers (C1R for the i1 type and
C2R for the i2 type) as described by Rhead et al. (42).

Table 5: vacA genotyped isolates and disease state†

                                vacA        Disease state of vacA genotyped isolates
                     Total‡     genotyped   Gastritis  Gastric    Duodenal   Gastric
                                isolates               ulcer      ulcer      cancer
Overall total        254        225         103        43         49         30
  Age range          14-86      14-86       19-82      34-84      14-72      37-86
  Mean age           51         50          49         55         45         58
Females              126        111         68         10         19         14
  Age range          21-86      21-86       21-82      46-84      31-72      37-86
  Mean age           52         52          49         57         51         61
Males                128        114         35         33         30         16
  Age range          14-82      14-82       19-78      34-82      14-70      38-70
  Mean age           50         49          47         54         41         55
s1/i1/m1                        206         95         41         42         28
  Age range                     14-86       19-78      34-84      14-70      37-86
  Mean age                      50          48         55         43         58
  Females                       96          61         8          14         13
    Age range                   21-86       21-75      46-84      31-61      37-86
    Mean age                    51          49         57         48         60
  Males                         110         34         33         28         15
    Age range                   14-82       19-78      34-82      14-70      38-70
    Mean age                    49          47         54         41         56
s1/i1/m2                        11          4          1          4          2
  Age range                     38-82       38-82      63         41-57      46-68
  Mean age                      54          54         N/A        51         56
  Females                       9           4          1          3          1
    Age range                   38-82       38-82      63         48-57      68
    Mean age                    56          54         N/A        54         N/A
  Males                         2           0          0          1          1
    Age range                   41-44       0          0          41         46
    Mean age                    43          0          0          N/A        N/A
s1/i2/m2                        8           4          1          3          0
  Age range                     38-72       38-68      56         61-72      0
  Mean age                      57          51         N/A        65         0
  Females                       6           3          1          2          0
    Age range                   38-72       38-68      56         61-72      0
    Mean age                    58          53         N/A        67         0
  Males                         2           1          0          1          0
    Age range                   43-61       43         0          61         0
    Mean age                    52          N/A        0          N/A        0

‡ Total number of samples only includes the 254 that contained complete epidemiological data for age and gender.
† There was a statistical association between the m allele and gender (P=0.0233).

The distribution of vacA polymorphisms is shown in Table 5. Of the 254 strains
with complete epidemiological data, 225 strains were successfully genotyped for
the vacA allele (Table 5 and see Table S2 in the supplemental material). The strains that
were not successfully genotyped for vacA failed to yield PCR products or gave
incorrectly sized bands and thus were not further analyzed. The genotyped strains were
obtained from patients with a mean age of 50 years and an age range of 14 to 86 years.
These patients included 111 females (49.3%) with a mean age of 52 years and an age
range of 21 to 86 years, and 114 males (50.7%) with a mean age of 49 years and an age
range of 14 to 82 years. Of these 225 strains, 206 strains (91.6%) carried an s1/i1/m1
vacA allele (mean patient age of 50 years and an age range of 14 to 86 years). Of the
strains carrying the s1/i1/m1 allele, 96 (46.6%) were from female patients (mean age of
51 years and an age range of 21 to 86 years), and 110 (53.4%) were from male patients
(mean age of 49 years and an age range of 14 to 82 years). Eleven strains (4.9%) carried
an s1/i1/m2 vacA allele and were obtained from patients with a mean age of 54
years and an age range of 38 to 82 years. Of these strains, 9 (81.8%) were obtained from
female patients (mean age of 56 years and an age range of 38 to 82 years), and 2 (18.2%)
were obtained from male patients (mean age of 43 years and an age range of 41 to 44
years). Eight strains (3.6%) carried the s1/i2/m2 vacA allele and were obtained from
patients with a mean age of 57 years and an age range of 38 to 72 years. Six of these
strains (75.0%) were obtained from female patients (mean age of 58 years with an age
range of 38 to 72 years), and 2 (25.0%) were obtained from male patients (mean age of
52 years and an age range of 43 to 61 years). Of note, neither s2 nor s1/i2/m1 vacA
alleles were found.

Distribution  of  vacA  Allele  and  Gender 

Statistical analysis of allele distributions showed a significant association between
the vacA allele and gender (P=0.0233). Patients who carried non-s1/i1/m1 strains were
4.3 times more likely to be female than male. This difference is likely driven by the m
region, since the distribution of any combination that contained the m region when
compared to gender was statistically significant (P=0.005 for s/m and P=0.0233 for i/m),
whereas the combination lacking m (P=0.1672 for s/i) was not significant. Moreover,
when the distribution of polymorphisms within each region was analyzed alone versus
gender, only the distribution of the m polymorphisms was significant (P=0.008). In fact,
if a patient carried a strain with an m2 allele, they were 3.75 times more likely to be
female than male. This fact, combined with the fact that the m1 allele appears to confer
toxicity to a larger variety of cells (22), may contribute to the finding that males are 1.5
to 2.5 times more likely to develop gastric cancer than females (reviewed in 43).
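
The two ratios quoted above can be reproduced from the counts in Table 5. The following is a minimal sketch that assumes a standard two-sided Fisher's exact test on the 2 x 2 table of m-region type versus gender; it is not the SPSS or SAS procedure used for the reported analysis, so the P value it returns need not match the reported value exactly.

```python
from math import comb

# vacA-genotyped isolates by m-region type and patient gender (Table 5).
# m2 combines the s1/i1/m2 and s1/i2/m2 alleles.
M2_FEMALE, M2_MALE = 9 + 6, 2 + 2
M1_FEMALE, M1_MALE = 96, 110


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    return sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Odds of being female for m2 carriers relative to m1 carriers.
odds_ratio = (M2_FEMALE * M1_MALE) / (M2_MALE * M1_FEMALE)
# Ratio of female to male patients among m2 carriers.
female_to_male_m2 = M2_FEMALE / M2_MALE
p_value = fisher_exact_2x2(M2_FEMALE, M2_MALE, M1_FEMALE, M1_MALE)

if __name__ == "__main__":
    print(f"odds ratio: {odds_ratio:.2f}")
    print(f"female:male ratio among m2 carriers: {female_to_male_m2:.2f}")
    print(f"two-sided Fisher P: {p_value:.4f}")
```

Because only s1 alleles were found and every i2 strain was also m2, the non-s1/i1/m1 group and the m2 group contain the same 19 strains, so the two figures describe the same counts as an odds ratio and as a simple female-to-male ratio, respectively.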

Associations  among  vacA,  cagA,  and  Disease 

Given  the  diversity  of  the  identified  roles  of  the  VacA  toxin,  we  assessed  whether 
the  distribution  of  the  vacA  alleles  had  a  direct  impact  on  disease  state.  First,  the 
individual  regions  were  assessed  for  their  impact  on  disease  development.  The 
distribution  of  polymorphisms  in  the  m  region  and  i  region  among  disease  state  was  not 
statistically significant (P=0.5397 and P=0.7399, respectively), and the distribution of
polymorphisms in the s region in relationship to disease state could not be determined
because only the s1 allele was found within this population. Statistical analysis for two
way associations using SAS software showed no statistical association between the
distribution of the vacA alleles and disease state (P=0.7499). However, log linear
modeling, taking into account age and gender, did reveal a two way association between
vacA allele and disease only in the East Asian (EPIYA-ABD) strains (P=0.030). The
majority of East Asian EPIYA-ABD CagA strains carrying non-s1/i1/m1 vacA alleles
were associated with duodenal ulcers. Conversely, strains carrying non-s1/i1/m1 vacA
alleles and any other genotype of CagA were associated with gastritis. A complete
breakdown of the vacA allele and disease state is provided in Table 5.

VacA was previously suggested to interact synergistically with the H. pylori
virulence factor CagA (2, 36). Thus, we next analyzed whether there was any
association between the distribution of vacA alleles, the distribution of cagA alleles, and
disease state. Of the 225 strains that were genotyped for the vacA allele, 224
had previously been genotyped for the cagA allele. Of these strains, 199 isolates
(88.8%) were classified as East Asian (encoding an EPIYA-D motif), and 25
isolates (11.2%) were classified as Western (encoding at least
one EPIYA-C motif; see Table S2 in the supplemental material). Eight East Asian strains
carried an EPIYA motif other than a defined -ABD motif, either incomplete or
containing the addition of one or more motifs, including -AABD, -BD, -BBD,
-ABAB*D, and -AB*D, where a mutation within the EPIYA-B motif is designated
by the asterisk. Based on these distributions, the strains were subdivided based on the
presence of a complete CagA EPIYA-ABD motif versus all other EPIYA motifs, yielding
33 isolates that were determined to have “other genotypes.” When the distribution of the
vacA alleles (s1/i1/m1, s1/i1/m2, s1/i2/m2) was assessed among the distribution of cagA
alleles (EPIYA-ABD versus all other genotypes), a strong two way association was
identified (P=0.0007; Fig. 9 and Table 6). People infected with strains carrying
non-EPIYA-ABD cagA alleles had a two-times-higher probability of carrying
the s1/i1/m2 vacA allele and a 10-times-higher probability of carrying the s1/i2/m2 vacA
allele than people infected with strains carrying the EPIYA-ABD cagA allele (Table 6). When
combinations of the regions were compared, the distribution of polymorphisms among
the cagA alleles was statistically significant for every combination, s/m (P=0.0006), i/m
(P=0.0007), and s/i (P=0.0019), and the distributions of the individual regions of vacA, m
(P=0.0019) and i (P=0.0019), versus the cagA allele were also statistically significant.
Once again, the distribution of polymorphisms in the s region among cagA alleles could
not be determined because only the s1 allele was found within this population. This
indicates that each of these regions is important for the distribution of the vacA allele
with cagA allele.

Figure 9: Schematic depiction of the distribution of the vacA genotypes stratified by
disease state and cagA allele within this South Korean population. Shown is the
distribution of vacA genotypes within the four different disease states: gastritis, gastric
ulcers, duodenal ulcers, and cancer. The shaded portions within the disease state and
vacA genotype subgroupings correspond to the isolates that carry a cagA EPIYA-ABD
motif.

Table 6: vacA genotype and cagA allele†

                     vacA            vacA and cagA   cagA allele
                     genotyped       genotyped       EPIYA-ABD   Other
                     isolates        isolates                    alleles*
Overall total        225             224             191         33
  Age range          14-86           14-86           14-86       28-82
  Mean age           50              50              50          49
Females              111             111             90          21
  Age range          21-86           21-86           21-86       28-82
  Mean age           52              52              52          50
Males                114             113             101         12
  Age range          14-82           14-82           14-82       33-81
  Mean age           49              49              49          49
s1/i1/m1             206             205             180         25
  Age range          14-86           14-86           14-86       28-81
  Mean age           50              50              50          47
  Females            96              96              82          14
    Age range        21-86           21-86           21-86       28-61
    Mean age         51              51              52          46
  Males              110             109             98          11
    Age range        14-82           14-82           14-82       33-81
    Mean age         49              49              49          49
s1/i1/m2             11              11              8           3
  Age range          38-82           38-82           41-68       38-82
  Mean age           54              54              52          58
  Females            9               9               6           3
    Age range        38-82           38-82           41-68       38-82
    Mean age         56              56              56          58
  Males              2               2               2           0
    Age range        41-44           41-44           41-44       N/A
    Mean age         43              43              43          N/A
s1/i2/m2             8               8               3           5
  Age range          38-72           38-72           54-72       38-68
  Mean age           57              57              62          53
  Females            6               6               2           4
    Age range        38-72           38-72           54-72       38-68
    Mean age         58              58              63          56
  Males              2               2               1           1
    Age range        43-61           43-61           61          43
    Mean age         52              52              N/A         N/A

* indicates any other genotype besides EPIYA-ABD, including Western strains and EPIYA-AABD, -BD, -BBD, -ABAB**D, as well
as -AB**D where a mutation within the EPIYA-B motif is designated by the **
† There is a statistical two way association between the distribution of cagA alleles and the distribution of vacA alleles (P=0.0007).
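
The relative probabilities quoted above, and the kind of exact test that can be applied to a 2 x 3 table, can be sketched from the Table 6 counts. The code below assumes the Freeman-Halton extension of Fisher's exact test; the statistical package options used for the reported P value are not restated here, so the computed value is illustrative and need not match it exactly. The function name `freeman_halton_2x3` is hypothetical.

```python
from math import comb

# Rows: cagA allele group; columns: vacA s1/i1/m1, s1/i1/m2, s1/i2/m2 (Table 6).
EPIYA_ABD = (180, 8, 3)
OTHER = (25, 3, 5)


def freeman_halton_2x3(row_a, row_b):
    """Exact test for a 2 x 3 table (Freeman-Halton extension of Fisher's test).

    Enumerates every table with the observed margins and sums the
    probabilities of those no more likely than the observed table.
    """
    col = tuple(a + b for a, b in zip(row_a, row_b))
    size_b = sum(row_b)
    denom = comb(sum(col), size_b)

    def prob(x1, x2, x3):
        # Multivariate hypergeometric probability of row_b = (x1, x2, x3).
        return comb(col[0], x1) * comb(col[1], x2) * comb(col[2], x3) / denom

    p_obs = prob(*row_b)
    p_value = 0.0
    for x2 in range(min(col[1], size_b) + 1):
        for x3 in range(min(col[2], size_b - x2) + 1):
            x1 = size_b - x2 - x3
            if 0 <= x1 <= col[0]:
                p = prob(x1, x2, x3)
                if p <= p_obs * (1 + 1e-9):
                    p_value += p
    return p_value


def fraction(row, index):
    """Fraction of a cagA group that carries the vacA allele at the given column."""
    return row[index] / sum(row)


# How much more often non-EPIYA-ABD strains carry each minor vacA allele.
ratio_s1i1m2 = fraction(OTHER, 1) / fraction(EPIYA_ABD, 1)
ratio_s1i2m2 = fraction(OTHER, 2) / fraction(EPIYA_ABD, 2)
p_value = freeman_halton_2x3(EPIYA_ABD, OTHER)

if __name__ == "__main__":
    print(f"s1/i1/m2 ratio (other vs EPIYA-ABD): {ratio_s1i1m2:.1f}")
    print(f"s1/i2/m2 ratio (other vs EPIYA-ABD): {ratio_s1i2m2:.1f}")
    print(f"exact P: {p_value:.5f}")
```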

Given the strong correlation between cagA allele and disease state we previously
observed (23), we next wondered if the vacA allele affected this distribution. Indeed, log
linear modeling revealed a significant three-way association among vacA allele, cagA
allele, and disease state (P=0.027). As with the case of gender, the distribution of any
combination that contained the m region when assessed via the distribution of cagA allele
and disease state was statistically significant (P=0.004 for s/m and P=0.025 for i/m),
whereas the non m combination (P=0.586 for s/i) was not significant. Moreover, when
the distribution of polymorphisms within each region was analyzed individually, only the
distribution of the m polymorphism (P=0.011) was significant. A complete breakdown
of the strains based on disease state, vacA allele, and cagA allele is provided in Table 7.


Table 7: vacA allele, cagA allele and disease state†

                                 Disease state of vacA genotyped isolates
vacA and cagA genotyped   Total  Gastritis  Gastric Ulcer  Duodenal Ulcer  Gastric Cancer
isolates

Overall Total             224    103        42             49              30
  EPIYA-ABD               192    83         38             40              30
  Other genotypes*        32     20         4              9               0
  Females                 111    68         10             19              14
    EPIYA-ABD             90     57         8              11              14
    Other genotypes*      21     11         2              8               0
  Males                   113    35         32             30              16
    EPIYA-ABD             101    26         30             29              16
    Other genotypes*      12     9          2              1               0

s1/i1/m1                  205    95         40             42              28
  EPIYA-ABD               180    81         37             34              28
  Other genotypes*        25     14         3              8               0
  Females                 96     61         8              14              13
    EPIYA-ABD             82     55         7              7               13
    Other genotypes*      14     6          1              7               0
  Males                   109    34         32             28              15
    EPIYA-ABD             98     26         30             27              15
    Other genotypes*      11     8          2              1               0

s1/i1/m2                  11     4          1              4               2
  EPIYA-ABD               8      1          1              4               2
  Other genotypes*        3      3          0              0               0
  Females                 9      4          1              3               1
    EPIYA-ABD             6      1          1              3               1
    Other genotypes*      3      3          0              0               0
  Males                   2      0          0              1               1
    EPIYA-ABD             2      0          0              1               1
    Other genotypes*      0      0          0              0               0

s1/i2/m2                  8      4          1              3               0
  EPIYA-ABD               3      1          0              2               0
  Other genotypes*        5      3          1              1               0
  Females                 6      3          1              2               0
    EPIYA-ABD             2      1          0              1               0
    Other genotypes*      4      2          1              1               0
  Males                   2      1          0              1               0
    EPIYA-ABD             1      0          0              1               0
    Other genotypes*      1      1          0              0               0

* indicates any other genotype besides EPIYA-ABD, including Western strains and EPIYA-AABD,
-BD, -BBD, -ABAB**D, as well as -AB**D, where a mutation within the EPIYA-B motif is
designated by the **.

† There is a statistical three-way association between the distribution of cagA alleles, the
distribution of vacA alleles, and disease state (P=0.027).
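
As a quick arithmetic check on Table 7, the per-allele totals can be tallied against the
overall count. The sketch below is illustrative only: the variable and function names are
hypothetical, and the counts are simply copied from the first number listed for each vacA
allele group in Table 7 rather than recomputed from the raw strain list.

```python
# Totals per vacA allele, copied from Table 7 (first number in each allele row).
vaca_totals = {"s1/i1/m1": 205, "s1/i1/m2": 11, "s1/i2/m2": 8}
overall_total = 224  # "Overall Total" row of Table 7


def allele_shares(totals, overall):
    """Return each allele's share of all genotyped isolates, as a fraction."""
    return {allele: count / overall for allele, count in totals.items()}


shares = allele_shares(vaca_totals, overall_total)

# The three allele groups should account for every genotyped isolate.
assert sum(vaca_totals.values()) == overall_total

for allele, share in shares.items():
    print(f"{allele}: {share:.1%}")
```

The same tally could be repeated for any column of the table; only the totals are shown here.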

Sequence Analysis of the Intermediate 1 Region

The i region was previously suggested to be the best indicator of pathology
caused by VacA, and the i1 type is more virulent than i2 (2, 42). Two specific amino
acids, a phenylalanine at position 178 in cluster A and a methionine at position
254 in cluster C (Fig. 10), have been identified as markers within the Taiwanese
population, and a particular amino acid substitution at position 231 in cluster B has been
suggested to affect disease severity (45). For these reasons, we determined the i1 type
amino acid sequence for 60 strains (18 from cancer patients, 8 from gastric ulcer patients,
10 from duodenal ulcer patients, and 24 from gastritis patients; Fig. 10).

Sequence analysis of the i1 type in our strains revealed that a phenylalanine at
position 178 in cluster A was present in 91.7% of the strains in the South Korean
population. Additionally, the substitution of a methionine at position 254 in cluster C
was well conserved in the South Korean population; 98.3% of strains contained this
substitution.

Substitution of a glycine for a serine at position 231 in cluster B was previously
suggested to be statistically linked to disease development within the Taiwanese
population (45). However, at this site within strains among the South Korean population,
neither the distribution of the amino acids with regard to disease (P=0.8082) nor the
distribution of glycine as compared to any other amino acid with regard to disease
(P=0.5214) was statistically significant. Additionally, there was no statistical
significance when the distribution of glycine was determined with regard to individual
disease states: gastritis (P=0.7871), gastric ulcer (P=0.4329), duodenal ulcer (P=0.7287),
or cancer (P=0.2414). However, there was a statistical association between the
distribution of the glycine amino acid and the distribution of cagA alleles (P=0.0318).

Figure 10: Amino acid alignment of the i1 type of VacA. This amino acid alignment is
from 60 Korean strains of various disease states: 24 from gastritis (G) patients, 10 from
duodenal ulcer (DU) patients, 8 from gastric ulcer (GU) patients, and 18 from gastric
cancer (CA) patients. The abbreviation listed after each strain corresponds to the disease
state of the strain. The three defined clusters of the i1 region, clusters A, B, and C, are
indicated.

Sequence analysis of the i1 type revealed two additional amino acid
polymorphisms across the South Korean population. At position 151, the
strains carried either a phenylalanine (37 strains) or a tyrosine (23 strains). The distribution of
amino acids at this position was not statistically linked to disease (P=0.3886) or the
distribution of cagA alleles (P=0.6983). Additionally, a polymorphism at position 196 leads
to either a serine or a leucine at this position. The distribution of amino acids at this
position had no statistical association with the distribution of cagA alleles (P=1.0000) or
disease state (P=0.0669).


Discussion 

The majority (92%) of the South Korean isolates analyzed in this study carried
the s1/i1/m1 vacA allele, which was previously suggested to be the most virulent form of
the toxin (3, 24, 30-32, 42). The fact that the majority of H. pylori strains in this
population carry the most toxic form of both VacA and CagA may explain the high rate
of severe gastric disease among the South Korean population. When age and gender
were taken into account, a two-way association between the distribution of vacA alleles
and disease state was found within the strains carrying EPIYA-ABD CagA. Non
s1/i1/m1 vacA alleles were associated with duodenal ulcers within the population
carrying the East Asian EPIYA-ABD CagA, and with gastritis within the population carrying
any other genotype of CagA.

The distribution of the m alleles varied significantly across gender and the cagA
allele and had a significant impact on the three-way association among the vacA allele,
the cagA allele, and disease state. This suggests that the polymorphism within the m region
is the major contributor to the association of the vacA allele, the cagA allele, and disease
state within this population. This is in concordance with a previously reported meta-analysis that
found that the m1 region increased the risk for gastric cancer in Latin American (odds
ratio [OR]=3.59) and African (OR=10.18) populations (47). The increased tropism of the
m1 vacA allele (37), combined with the finding that patients infected with H. pylori
strains encoding the m2 allele are more likely to be female, may explain why males are
overall more likely to develop gastric cancer (reviewed in 43). To our knowledge, this is
the first time that the m allele distribution has been linked to gender. To determine the
role that the m allele has in the association between gender and disease state, populations
where the m2 allele is more prevalent, such as in regions of China (39) and Poland (25),
should be analyzed. However, it should be noted that this region alone is not a good
predictor of disease state for the South Korean population, since two of the strains that
carried the m2 allele were from cancer patients (Table 5).
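
The link between the m allele and gender can be illustrated with a simple 2x2 comparison
built from the counts in Table 7 (m1: 96 female and 109 male isolates; m2: 15 female and
4 male isolates, pooling s1/i1/m2 and s1/i2/m2). The sketch below computes a Pearson
chi-square statistic by hand. It is an illustration under stated assumptions, not the log
linear modeling used in this study, and the function name is hypothetical.

```python
import math


def pearson_chi_square_2x2(table):
    """Pearson chi-square statistic and P value (1 degree of freedom) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of m allele and sex.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: m1, m2; columns: female, male (counts taken from Table 7).
m_by_sex = [[96, 109], [15, 4]]
chi2, p_value = pearson_chi_square_2x2(m_by_sex)
print(f"chi-square = {chi2:.2f}, P = {p_value:.4f}")
```

Because the m2 group is small, an exact test would often be preferred in practice; the
chi-square form is shown only because it needs no external library.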

Previous work with Western strains suggested that the i region of vacA is the
major determinant of vacuolating activity and is the most important region for disease
development (2, 13, 42). However, we found that within this population of
predominantly East Asian strains, the i region was not a major determinant of disease
state. This may indicate that the i region is more important within the context of strains
that express the Western cagA allele or that there are other factors that mask the
importance of this region in East Asian isolates.

Three clusters within the i1 region where sequence differences occur have
been reported: clusters A, B, and C (42). Amino acid substitutions (tyrosine to
phenylalanine in cluster A and asparagine to methionine in cluster C) are conserved
and predicted to serve as markers for Taiwanese VacA (45). Also, the substitution of a
glycine for a serine at the ninth amino acid in cluster B was statistically linked to disease
development in the Taiwanese population (45). The amino acid substitutions within
cluster A and cluster C were also conserved in the South Korean isolates, indicating that
these substitutions are likely a marker of East Asian VacA. No correlation between the
ninth amino acid substitution in cluster B and disease severity was identified for our
South Korean population, which suggests that this amino acid either does not play a role in
disease progression or is important only in combination with another virulence factor.

Two additional positions within the i1 region that showed polymorphism were
identified: positions 151 and 196. Neither the phenylalanine nor the tyrosine found at
position 151 was linked to disease state or the distribution of cagA alleles. While the
distribution of the amino acids at position 196 had no statistical association with the cagA
allele or disease, there was a trend toward significance: ~78% of strains from cancer
patients and 100% of strains from gastric ulcer patients carry a serine at this position.
This suggests that additional populations should be assessed to determine if a serine at
position 196 has an impact on disease development and severity.




Previously reported studies have identified an association between vacA and
cagA that appears to affect H. pylori toxicity and disease severity (55, 57). Basso et al.
found that increasing numbers of CagA EPIYA-C motifs impacted cancer risk, and that i
region polymorphisms of VacA were a major indicator for the development of peptic
ulcers (5). Additionally, infection with strains carrying CagA and s1/m1 VacA results in
highly active corpus gastritis (32), which is linked to development of gastric cancer
(30-32). In this study, log linear modeling, taking into consideration age and gender,
identified a two-way association between the vacA allele and disease state for East Asian
(CagA EPIYA-ABD) strains. Not surprisingly, since the majority of these strains carried
both CagA EPIYA-ABD and vacA s1/i1/m1, the majority of cancer strains (28 out of 30)
carry this combination. This suggests that the vacA allele could differentially
affect disease progression based on other virulence factors. This could be due to the
finding that VacA acts as an immune modulator (15, 52) and perhaps changes the
immune response to the immunogenic CagA. Like CagA, VacA was found to
disorganize the cytoskeleton of gastric epithelial cells, leading to increased cell spreading
and growth (38). Thus, this phenotype may help compensate for the presence of a less
virulent cagA allele or synergistically contribute to severe gastric maladies in conjunction
with East Asian CagA. Evidence suggests that the combination of CagA and VacA may
dampen the effect of each protein alone, possibly leading to increased survival of infected
host cells (2). This, perhaps, occurs through CagA preventing VacA-induced apoptosis
(36, 56) or by inhibiting the autophagy pathway induced by VacA (49).

CagA and VacA are the two best studied virulence factors of H. pylori.
Interestingly, both toxins exhibit a high degree of polymorphism, and it is becoming
increasingly evident that these polymorphisms, alone and in concert, affect H. pylori-induced
disease. Indeed, the finding that the majority of South Korean H. pylori strains
carry the most toxic form of CagA and VacA may explain the high
prevalence of gastric disease and mortality of patients with gastric cancer in South Korea.
However, the reason why only a portion of the population develops gastric cancer still
remains unclear. Other bacterial virulence factors as well as multiple host, dietary, and
environmental factors have been implicated in H. pylori-induced
disease (reviewed in 8 and 53). Further study is required to determine which factors are
involved and what role they have in the development of H. pylori-induced gastric cancer.
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Table  S2:  Complete  Korean  Collection 


Strain   Disease             Sex  Age  cagA      cagA       vacA       vacA
                                       EPIYA     accession  allele     accession
                                       motif     number                number

K1-CA    Cancer              F    68   ABD       FJ458117   s1/i1/m2   GQ338184
K2-CA    Cancer              F    64   ABD                  s1/i1/m1   GQ338205
K3-CA    Cancer              F    65   ABD       FJ458118   s1/i1/m1   GQ338222
K4-CA    Cancer              F    37   ABD                  s1/i1/m1   GQ338225
K5-CA    Cancer              M    70   ABD                  s1/i1/m1   GQ338227
K6-CA    Cancer              F    45   ABD                  s1/i1/m1   GQ338233
K7-CA    Cancer              M    56   ABD                  s1/i1/m1   GQ338235
K8-CA    Cancer              M    56   ABD                  s1/i1/m1   GQ338239
K9-CA    Cancer              M    58   ABD       FJ458119   s1/i1/m1   GQ338242
K10-CA   Cancer              M    52   ABD                  s1/i1/m1
K11-CA   Cancer              M    68   ABD                  s1/i1/m1
K12-G    Gastritis           F    52   ABD                  s1/i1/m1
K13-CA   Cancer              M    38   ABD                  s1/i1/m1
K14-CA   Cancer              F    78   ABD                  s1/i1/m1
K15C-CA  Cancer              F    66   ABD       FJ458120   s1/i1/m1
K16-CA   Cancer              M    48   ABD                  s1/i1/m1   GQ338194
K17-CA   Cancer              F    56   ABD                  s1/i1/m1   GQ338196
K18-CA   Cancer              M    64   ABD                  s1/i1/m1
K19-CA   Cancer              F    86   ABD                  s1/i1/m1
K20-CA   Cancer              M    48   ABD                  s1/i1/m1
K21-CA   Cancer              M    44   ABD                  s1/i1/m1   GQ338209
K22-GU   Gastric Ulcer       M    42   ABD                  s1/i1/m1
K23-DU   Duodenal Ulcer      M    47   ABD                  s1/i1/m1   GQ338210
K24-DU   Duodenal Ulcer      M    38   ABD       FJ458121   s1/i1/m1
K25-DU   Duodenal Ulcer      M    44   ABD                  s1/i1/m1   GQ338214
K26-DU   Duodenal Ulcer      M    20   ABD                  s1/i1/m1   GQ338218
K27-DU   Duodenal Ulcer      M    47   ABD                  s1/i1/m1
K28-DU   Duodenal Ulcer      M    28   ABD                  s1/i1/m1
K29-DU   Duodenal Ulcer      F    57   ABD                  s1/i1/m1
K30-DU   Duodenal Ulcer      F    61   ABCCC     FJ458122   s1/i2/m2
K31-DU   Duodenal Ulcer      M    33   ABD                  s1/i1/m1
K32-DU   Duodenal Ulcer      F    41   ABD                  s1/i1/m1
K33-DU   Duodenal Ulcer      F    31   ABC       FJ458123   s1/i1/m1
K34-DU   Duodenal Ulcer      M    43   ABD       FJ458124   s1/i1/m1   GQ338223
K35-DU   Duodenal Ulcer      F    56   ABD                  s1/i1/m2
K36-DU   Duodenal Ulcer      M    46   ABD                  s1/i1/m1
K37-DU   Duodenal Ulcer      M    61   ABD                  s1/i2/m2
K38-DU   Duodenal Ulcer      F    39   ABC                  s1/i1/m1   GQ338224
K39-DU   Duodenal Ulcer      M    59   ABD                  s1/i1/m1
K40-DU   Duodenal Ulcer      M    53   ABD                  s1/i1/m1
K41-DU   Duodenal Ulcer      M    55   ABD                  s1/i1/m1
K42-DU   Duodenal Ulcer      F    48   ABD                  s1/i1/m2
K43-DU   Duodenal Ulcer      M    70   ABD                  s1/i1/m1
K44-DU   Duodenal Ulcer      M    42   ABD                  s1/i1/m1
K45-DU   Duodenal Ulcer      M    22   ABD       FJ458125   s1/i1/m1
K46-DU   Duodenal Ulcer      F    61   ABD                  s1/i1/m1
K47-DU   Duodenal Ulcer      F    72   ABD                  s1/i2/m2
K48-DU   Duodenal Ulcer      F    41   ABD                  s1/i1/m1
K49-DU   Duodenal Ulcer      M    33   ABC                  s1/i1/m1   GQ338226
K50-DU   Duodenal Ulcer      M    35   ABD                  s1/i1/m1
K51-GU   Gastric Ulcer       M    54   ABD                  s1/i1/m1   GQ338228
K52-GU   Gastric Ulcer       M    46   ABD                  s1/i1/m1   GQ338229
K53-G    Gastritis           M    60   ABD                  s1/i1/m1
K54-G    Gastritis           F    58   ABD       FJ458126   s1/i1/m1
K55-GU   Gastric Ulcer       F    57   ABD                  s1/i1/m1   GQ338230
K56-G    Gastritis           F    48   ABD                  s1/i1/m1
K57-G    Gastritis           F    63   ABD                  s1/i1/m1   GQ338231
K58-G    Gastritis           F    61   ABD                  s1/i1/m1   GQ338232
K59-G    Gastritis           M    48   BBD       FJ458127   s1/i1/m1
K60-G    Gastritis           M    53   ABC                  s1/i1/m1
K61-GU   Gastric Ulcer       F    57   *                    *
K62-GU   Gastric Ulcer       F    65   *                    *
K63-GU   Gastric Ulcer       M    59   *                    *
K64-G    Gastritis           F    61   ABCC      FJ458128   s1/i1/m1
K65-G    Gastritis           M    49   ABD                  s1/i1/m1
K66-G    Gastritis           M    43   ABC                  s1/i2/m2
K67-DU   Duodenal Ulcer      F    57   ABD                  s1/i1/m1
K68-GU   Gastric Ulcer       F    46   ABD                  s1/i1/m1
K69-GU   Gastric Ulcer       M    63   ABD       FJ458129   s1/i1/m1   GQ338234
K70-G    Gastritis           F    68   ABC       FJ458130   s1/i2/m2
K71-G    Gastritis           M    54   ABD                  s1/i1/m1
K72-GU   Gastric Ulcer       M    34   ABD                  s1/i1/m1   GQ338236
K73-GU   Gastric Ulcer       M    72   ABD                  s1/i1/m1
K74-G    Gastritis           F    52   ABD                  s1/i1/m1   GQ338237
K75-G    Gastritis           F    24   ABD                  s1/i1/m1
K76-G    Gastritis           F    55   ABD                  s1/i1/m1
K77-G    Gastritis           F    37   ABD                  s1/i1/m1   GQ338238
K78-G    Gastritis           M    36   AABD      FJ458131   s1/i1/m1
K79-GU   Gastric Ulcer       F    84   ABD                  s1/i1/m1
K80-CA   Cancer              F    61   ABD                  s1/i1/m1   GQ338240
K81-GU   Gastric Ulcer       M    47   ABD                  s1/i1/m1   GQ338241
K82-G    Gastritis           M    39   ABD       FJ458132   s1/i1/m1
K83-G    Gastritis           F    75   ABD                  s1/i1/m1
K84-G    Gastritis           M    48   ABD                  s1/i1/m1
K85-G    Gastritis           F    28   BD                   s1/i1/m1
K86-G    Gastritis           M    37   ABCC                 s1/i1/m1
K87-G    Gastritis           F    52   ABD                  s1/i1/m1
K88-G    Gastritis           F    69   ABD                  s1/i1/m1
K89-GU   Gastric Ulcer       M    38   ABD                  s1/i1/m1
K90-GU   Gastric Ulcer       M    51   ABD                  s1/i1/m1
K91-GU   Gastric Ulcer       M    82   ABD                  s1/i1/m1
K92-GU   Gastric Ulcer       M    41   ABD                  s1/i1/m1
K93-DU   Duodenal Ulcer      F    37   ABC       FJ458133   s1/i1/m1   GQ338243
K94-GU   Gastric Ulcer       M    65   ABD                  s1/i1/m1
K95-CA   Cancer              F    41   ABD                  s1/i1/m1
K96-G    Gastritis           F    47   ABD                  s1/i1/m1
K97-GU   Gastric Ulcer       M    51   ABD                  s1/i1/m1
K98-DU   Duodenal Ulcer      M    23   ABD                  s1/i1/m1
K99-G    Gastritis           F    54   ABD                  s1/i2/m2
K100-GU  Gastric Ulcer       M    46   ABD                  s1/i1/m1
K101-GU  Gastric Ulcer       F    61   ABD                  s1/i1/m1
K102-DU  Duodenal Ulcer      M    38   ABD                  s1/i1/m1
K103-G   Gastritis           M    32   ABD                  s1/i1/m1
K104-CA  Cancer              M    46   ABD                  s1/i1/m1   GQ338185
K105-GU  Gastric Ulcer       M    71   ABD                  s1/i1/m1
K106-DU  Duodenal Ulcer      M    14   ABD                  s1/i1/m1
K107-DU  Duodenal Ulcer      M    26   ABD                  s1/i1/m1
K108-GU  Gastric Ulcer       M    62   ABD                  s1/i1/m1
K109-G   Gastritis           M    40   ABD                  s1/i1/m1   GQ338186
K110-GU  Gastric Ulcer       M    81   ABCC      FJ458134   s1/i1/m1
K111-DU  Duodenal Ulcer      F    36   ABD                  s1/i1/m1   GQ338187
K112-G   Gastritis           M    57   ABD                  s1/i1/m1   GQ338188
K113-G   Gastritis           M    29   ABD                  s1/i1/m1
K114-DU  Duodenal Ulcer      F    47   ABC                  s1/i1/m1
K115-G   Gastritis           F    82   ABC       FJ458135   s1/i1/m2
K116-G   Gastritis           M    59   ABD                  s1/i1/m1
K117-G   Gastritis           F    21   ABD       FJ458136   s1/i1/m1   GQ338189
K118-CA  Cancer              F    67   ABD                  s1/i1/m1
K119-DU  Duodenal Ulcer      M    31   ABD       FJ458137   s1/i1/m1
K120-G   Gastritis           F    41   ABC                  s1/i1/m1   GQ338190
K121-GU  Gastric Ulcer       M    42   ABD                  s1/i1/m1
K122-DU  Duodenal Ulcer      M    60   ABD                  s1/i1/m1
K123-G   Gastritis           M    76   ABD       FJ458138   s1/i1/m1   GQ338191
K125-G   Gastritis           M    59   ABD                  s1/i1/m1
K126-GU  Gastric Ulcer       M    69   ABD                  s1/i1/m1
K127-GU  Gastric Ulcer       M    71   ABD                  s1/i1/m1
K128-GU  Gastric Ulcer       M    58   ABC                  s1/i1/m1   GQ338192
K129-GU  Gastric Ulcer       M    36   *                    *
K130-G   Gastritis           F    64   *                    *
K131-G   Gastritis           F    61   ABD       FJ458139   s1/i1/m1   GQ338193
K132-GU  Gastric Ulcer       M    23   *                    *
K133-GU  Gastric Ulcer       M    63   *                    *
K134-GU  Gastric Ulcer       M    46   *                    *
K135-DU  Duodenal Ulcer      F    62   *                    *
K136-G   Gastritis           F    52   *                    *
K137-G   Gastritis           F    62   *                    *
K138-GU  Gastric Ulcer       F    21   *                    *
K139-GU  Gastric Ulcer       F    49   *                    *
K140-G   Gastritis           F    49   *                    *
K141-G   Gastritis           M    57   *                    *
K142-GU  Gastric Ulcer       M    65   *                    *
K143-GU  Gastric Ulcer       F    71   *                    *
K144-GU  Gastric Ulcer       F    53   *                    *
K145-GU  Gastric Ulcer       M    62   *                    *
K146-G   Gastritis           M    40   ABD**     FJ458140   s1/i1/m1
K147-GU  Gastric Ulcer       M    62   *                    *
K148-GU  Gastric Ulcer       M    37   *                    *
K149-GU  Gastric Ulcer       M    71   *                    *
K150-G   Gastritis           F    26   ABD                  s1/i1/m1
K151-GU  Gastric Ulcer       M    65   ABD       FJ458141   s1/i1/m1
K152-G   Gastritis           F    62   ABD                  s1/i1/m1
K153-GU  Gastric Ulcer       M    42   ABD                  s1/i1/m1
K154-G   Gastritis           F    55   ABCCCC    FJ458142   s1/i1/m2
K155-DU  Duodenal Ulcer      N/A  N/A  ABD                  s1/i1/m1
K156-DU  Duodenal Ulcer      F    47   ABD                  s1/i1/m1
K157-G   Gastritis           M    43   ABD                  s1/i1/m1
K158-G   Gastritis           M    60   ABD                  s1/i1/m1
K159-G   Gastritis           F    35   ABD                  s1/i1/m1
K160-DU  Duodenal Ulcer      M    30   ABD                  s1/i1/m1
K161-G   Gastritis           F    65   ABD                  s1/i1/m1
K162-G   Gastritis           F    63   ABD                  s1/i1/m1   GQ338195
K163-G   Gastritis           F    66   ABD                  s1/i1/m1
K164-G   Gastritis           M    43   ABD                  s1/i1/m1
K165-G   Gastritis           M    28   ABD                  s1/i1/m1
K166-G   Gastritis           F    38   ABC                  s1/i2/m2
K167-G   Gastritis           F    27   ABD                  s1/i1/m1
K169-G   Gastritis           F    47   ABD                  s1/i1/m1
K170-G   Gastritis           F    41   ABD**     FJ458143   s1/i1/m1
K171-CA  Cancer              F    72   ABD       FJ458144   s1/i1/m1
K172-G   Gastritis           F    31   ABCC      FJ458145   s1/i1/m1
K173-G   Gastritis           F    45   ABD       FJ458146   s1/i1/m1
K174-G   Gastritis           N/A  N/A  ABD                  s1/i1/m1
K175-G   Gastritis           F    41   ABD                  s1/i1/m2
K176-DU  Duodenal Ulcer      N/A  N/A  ABD                  s1/i1/m1
K177-G   Gastritis           F    39   ABD                  s1/i1/m1
K178-G   Gastritis           F    40   ABD                  s1/i1/m1   GQ338197
K179-G   Gastritis           F    38   ABCCC     FJ458147   s1/i1/m2
K180-G   Gastritis (polyps)  F    50   ABD                  s1/i1/m1
K181-DU  Duodenal Ulcer      F    57   ABD                  s1/i1/m2
K182-DU  Duodenal Ulcer      N/A  N/A  ABD                  s1/i1/m1   GQ338198
K183-G   Gastritis           M    40   ABD                  s1/i1/m1   GQ338199
K184-G   Gastritis           F    55   ABD                  s1/i1/m1
K185-G   Gastritis           F    52   ABD                  s1/i1/m1   GQ338200
K186-G   Gastritis           M    41   ABD                  s1/i1/m1
K188-G   Gastritis (IM)      F    43   ABD                  s1/i1/m1
K190-G   Gastritis           M    61   ABC                  s1/i1/m1   GQ338201
K192-DU  Duodenal Ulcer      F    61   AABD      FJ458148   s1/i1/m1
K193-G   Gastritis           F    50   ABD                  s1/i1/m1   GQ338202
K194-GU  Gastric Ulcer       M    48   *                    s1/i1/m1
K195-GU  Gastric Ulcer       F    48   ABD       FJ458149   s1/i1/m1
K196-G   Gastritis           F    50   ABD                  s1/i1/m1   GQ338203
K197-G   Gastritis           F    45   ABD                  s1/i1/m1   GQ338204
K198-GU  Gastric Ulcer       F    56   ABC       FJ458150   s1/i2/m2
K199-GU  Gastric Ulcer       M    50   ABD                  s1/i1/m1
K200-GU  Gastric Ulcer       M    63   ABD                  s1/i1/m1
K201-GU  Gastric Ulcer       M    55   ABD                  s1/i1/m1
K202-GU  Gastric Ulcer       M    41   ABD                  s1/i1/m1
K203-G   Gastritis           F    55   ABD                  s1/i1/m1
K204-GU  Gastric Ulcer       F    63   ABD                  s1/i1/m2
K205-GU  Gastric Ulcer       F    57   ABD                  s1/i1/m1
K206-GU  Gastric Ulcer       F    51   ABD                  s1/i1/m1
K207-G   Gastritis           M    39   ABD                  *          GQ338206
K208-G   Gastritis           F    56   ABD       FJ458151   s1/i1/m1   GQ338207
K209-G   Gastritis           F    24   ABD                  s1/i1/m1   GQ338208
K210-G   Gastritis           F    61   ABD                  s1/i1/m1
K211-G   Gastritis           F    54   ABD                  s1/i1/m1
K212-G   Gastritis           F    45   ABD                  s1/i1/m1
K213-G   Gastritis           F    52   *                    *
K214-G   Gastritis           F    53   *                    *
K215-GU  Gastric Ulcer       F    73   *                    *
K216-G   Gastritis           F    67   ABD                  s1/i1/m1
K217-G   Gastritis           M    77   ABD                  *
K218-G   Gastritis           F    62   ABD                  s1/i1/m1
K219-G   Gastritis           M    40   ABD       FJ458152   s1/i1/m1
K220-DU  Duodenal Ulcer      N/A  N/A  ABD                  s1/i1/m1
K221-G   Gastritis           M    37   BD                   s1/i1/m1
K222-GU  Gastric Ulcer       M    41   ABD                  s1/i1/m1
K223-G   Gastritis           F    25   ABD       FJ458153   s1/i1/m1
K224-G   Gastritis           F    35   ABD                  s1/i1/m1
K225-DU  Duodenal Ulcer      F    60   ABC                  s1/i1/m1
K226-GU  Gastric Ulcer       M    42   ABD                  s1/i1/m1
K227-G   Gastritis           F    31   ABD                  s1/i1/m1
K228-GU  Gastric Ulcer       M    54   ABD                  s1/i1/m1
K229-GU  Gastric Ulcer       M    62   ABD                  s1/i1/m1
K230-G   Gastritis (IM)      M    56   ABD                  s1/i1/m1
K231-GU  Gastric Ulcer       M    42   ABD                  s1/i1/m1
K232-G   Gastritis           F    56   ABD                  s1/i1/m1
K233-G   Gastritis           M    38   ABD                  s1/i1/m1
K234-DU  Duodenal Ulcer      M    41   ABD                  s1/i1/m2
K235-G   Gastritis           F    50   ABD                  s1/i1/m1
K236-G   Gastritis           F    64   ABD                  s1/i1/m1
K237-G   Gastritis           F    48   ABD                  s1/i1/m1
K238-DU  Duodenal Ulcer      M    55   ABD                  s1/i1/m1   GQ338211
K239-G   Gastritis           M    46   ABD                  *
K240-G   Gastritis           F    41   ABD                  s1/i1/m1
K241-G   Gastritis           M    41   ABD                  s1/i1/m1
K242-G   Gastritis           M    78   ABD                  s1/i1/m1
K243-G   Gastritis           F    60   BC                   *
K244-DU  Duodenal Ulcer      N/A  N/A  ABD                  s1/i1/m1
K245-G   Gastritis           M    19   ABD                  s1/i1/m1
K246-G   Gastritis           F    40   ABD                  s1/i1/m1
K247-G   Gastritis           F    56   ABD                  s1/i1/m1
K248-G   Gastritis           M    58   ABD                  s1/i1/m1   GQ338212
K249-GU  Gastric Ulcer       F    48   ABC                  s1/i1/m1   GQ338213
K250-G   Gastritis           F    53   ABD                  s1/i1/m1
K251-DU  Duodenal Ulcer      M    70   ABD                  s1/i1/m1
K253-DU  Duodenal Ulcer      F    61   ABC                  s1/i1/m1
K254-G   Gastritis           M    54   ABD                  s1/i1/m1
K255-G   Gastritis           F    74   ABD       FJ458154   s1/i1/m1   GQ338215
K256-G   Gastritis           F    51   ABD                  s1/i1/m1
K257-CA  Cancer              M    64   ABD                  s1/i1/m1
K258-CA  Cancer              M    68   ABD       FJ458155   s1/i1/m1   GQ338216
K259-CA  Cancer              M    44   ABD       FJ458156   s1/i1/m2   GQ338217
K260-CA  Cancer              M    58   ABD       FJ458157   s1/i1/m1   GQ338219
K261-CA  Cancer              F    48   ABD       FJ458158   s1/i1/m1   GQ338220
K262-G   Gastritis           F    56   ABC       FJ458159   s1/i1/m1   GQ338221
K263-G   Gastritis           M    59   ABABD***  FJ458160   s1/i1/m1
K264-DU  Duodenal Ulcer      M    32   ABD       FJ458161   s1/i1/m1
K265-DU  Duodenal Ulcer      M    42   ABD       FJ458162   s1/i1/m1
K266-G   Gastritis           F    34   ABD       FJ458163   s1/i1/m1

* Indeterminate in genotyping assay

** The proline in the -B motif is replaced with a serine (ESIYA); therefore classified as other

*** -ABABD: the proline in the second -B motif is replaced with a leucine (ELIYA); therefore
classified as other
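
The grouping rule behind the "EPIYA-ABD" and "Other genotypes" rows of Table 7 can be
expressed as a small lookup over the motif strings listed in Table S2. The sketch below is
a hypothetical helper written for illustration (the function name and return labels are
assumptions); it only encodes what the table footnotes state, namely that an unmodified
ABD motif is EPIYA-ABD, a lone asterisk is indeterminate, and everything else, including
starred (mutated) motifs, is classified as other.

```python
def classify_caga_motif(motif):
    """Map a Table S2 EPIYA motif string to the grouping used in Table 7."""
    motif = motif.replace(" ", "")  # tolerate stray spaces such as "A ABD"
    if motif == "*":
        return "Indeterminate"
    if motif == "ABD":
        return "EPIYA-ABD"
    # Western types (e.g. ABC, ABCC), variants such as AABD or BD, and motifs
    # flagged with ** or *** (mutated B motif) all fall into the "other" group.
    return "Other"


examples = ["ABD", "ABCC", "A ABD", "ABD**", "ABABD***", "*"]
for motif in examples:
    print(f"{motif!r}: {classify_caga_motif(motif)}")
```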
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Chapter  Four 


Polymorphisms in the Intermediate Region of VacA Impact Helicobacter pylori-Induced
Disease Development


Published as: Jones, K.R.*, S. Jang*, J.Y. Chang, J. Kim, I.S. Chung, C.H. Olsen,
D.S. Merrell, and J.H. Cha. 2010. Polymorphisms in the Intermediate
Region of VacA Impact Helicobacter pylori-Induced Disease Development. J Clin
Microbiol 49:101-10.

*  These  authors  contributed  equally  to  this  work. 

The work presented in this chapter is the sole work of K. R. Jones with the
following exceptions: S. Jang and J.Y. Chang performed the vacA sequencing, J.H. Cha
assisted with experimental design, J. Kim assisted in GenBank registration, I.S. Chung
performed the biopsies and supplied the diagnoses, and C.H. Olsen assisted with the
statistical analysis.


Abstract 

Helicobacter pylori is the etiological agent of diseases such as gastritis, gastric
and duodenal ulcers, and two types of gastric cancers. While some insight has been
gained into the etiology of these diverse manifestations, by and large, the reason that
some individuals develop more severe disease remains elusive. Recent studies have
focused on the role of the H. pylori toxins, CagA and VacA, in the disease process and
have suggested that both toxins are intimately involved. Moreover, CagA and VacA are
polymorphic within different H. pylori strains, and particular polymorphisms seem to
show a correlation with the development of particular disease states. Among VacA
polymorphisms, the intermediate region has recently been proposed to play a major role
in disease outcome. Herein we describe a detailed sequence analysis of the polymorphic
intermediate region of vacA from strains obtained from a large South Korean population.
We show that polymorphisms found at amino acid position 196 are associated with more
severe disease manifestations. Additionally, polymorphisms found at amino acid position
231 are linked to disease in strains that carry the non EPIYA-ABD allele of CagA.
Collectively, these data help explain the impact of the VacA intermediate region on
disease and lead to the hypothesis that there are allele-driven interactions between VacA
and CagA.


Introduction 

The  medically  important  microbe,  Helicobacter  pylori  colonizes  the  inhospitable 
niche  of  the  gastric  mucosa  of  over  fifty  percent  of  the  world’s  population  (34,  54).  H. 
pylori  is  a  spiral-shaped,  microaerophilic,  Gram-negative  bacterium  (33)  that  is  the 
etiological  agent  of  a  multitude  of  diseases,  including  gastritis,  peptic  ulcers  (both 
duodenal  and  gastric  ulcers),  as  well  as  adenocarcinoma  and  MALT  lymphoma  (6,  12, 
17,  18,  45).  This  class  I  carcinogen  contributes  to  gastric  cancer  mortality,  which  is  still 
one  of  the  most  common  causes  of  mortality  due  to  cancer  (18,  42).  This  is  especially 
true  in  East  Asian  countries  such  as  China,  Korea,  and  Japan  (23). 
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Two  H.  pylori  toxins  that  facilitate  host  cellular  damage  and  directly  interplay 
with  the  host  immune  system  are  the  cytotoxin  associated  gene  A  (CagA)  and  the 
vacuolating  cytotoxin  (Vac A).  A  growing  number  of  studies  have  begun  to  suggest  that 
VacA  and  CagA  interact  in  such  a  way  as  to  affect  disease  severity  (2,  27,  59,  62).  VacA 
was  identified  within  a  few  years  of  the  discovery  of  H.  pylori,  and  appears  to  be 
produced  and  secreted  by  all  strains  (4,  13).  This  toxin  was  initially  identified  and  named 
due  to  its  ability  to  cause  large  cytoplasmic  vacuoles  in  intoxicated  host  cells  (14). 
However,  VacA  has  subsequently  been  shown  to  have  a  multitude  of  functions.  For 
instance,  when  inserted  into  the  plasma  membrane,  VacA  can  act  as  an  anion-selective 
channel  (52),  which  may  aid  bacterial  survival  through  leakage  of  host  cytosolic  anions 
that  can  be  utilized  by  the  bacterium  (40).  VacA  also  has  the  ability  to  induce  apoptosis 
through permeabilization of the mitochondrial membrane, thereby causing cytochrome c
release  (32,  60).  Furthermore,  VacA  can  induce  the  autophagy  pathway  (53),  disorganize 
the  host  cell  cytoskeleton  to  cause  spreading  of  host  cells,  inhibit  T  cell  activation,  and 
block  T  cell  and  B  cell  proliferation  (20,  56). 

CagA  is  directly  injected  into  host  cells  through  a  type  IV  secretion  apparatus 
(10),  and  is  phosphorylated  within  host  cells  by  host-cell  kinases.  This  phosphorylation 
event  subsequently  makes  CagA  competent  for  interaction  with  the  Src  homology  2 
domain-containing  protein  tyrosine  phosphatase  (SHP-2;  23,  26).  The  downstream 
effects of this interaction include alterations in numerous host signaling pathways (22,
24-26,  41,  48,  58),  which  are  believed  to  be  responsible  for  the  increased  cancer  risk 
associated  with  infection  by  strains  that  express  CagA  (7,  21).  Phosphorylation  of  CagA 
occurs in the carboxy-terminus of the protein at conserved tyrosine residues that exist as
part of a repeated five amino acid sequence (Glu-Pro-Ile-Tyr-Ala) referred to as the
EPIYA  repeat  (25,  26).  The  numbers  of  these  EPIYA  repeats  and  the  flanking  amino 
acid  regions  surrounding  these  repeats  vary  dramatically  across  strains.  Based  on  flanking 
sequences,  four  distinct  EPIYA  motifs  have  been  identified  (-A,  -B,  -C,  -D;  25,  61). 
Strains  that  carry  various  combinations  of  these  motifs  can  be  divided  into  two  main 
geographical  distributions,  which  are  hallmarked  by  differences  in  the  primary 
phosphorylation  sites,  EPIYA  -C  or  -D  (25);  East  Asian  strains  contain  EPIYA-ABD 
whereas  Western  strains  contain  EPIYA-ABC,  where  the  EPIYA-C  motif  may  be 
repeated  up  to  five  times  (2,  25,  26,  43).  These  different  EPIYA  combinations  have  been 
suggested  to  impact  disease  progression  (29,  61). 

Similar  to  the  polymorphic  nature  of  CagA,  VacA  contains  three  distinct 
segments  that  exhibit  variation  within  the  amino-terminus.  These  areas  of  variation  are 
broadly defined as the signal (s), intermediate (i), and middle (m) regions, and two or
more primary variants have been described for each region: signal, s1 or s2;
intermediate, i1, i2, or i3; and middle, m1 or m2 (3, 11, 47). Various combinations of
each  s,  i,  and  m  region  are  then  combined  within  each  H.  pylori  strain  to  yield  a  particular 
vacA  allele.  The  s  region  of  VacA  appears  to  influence  the  efficiency  of  anion  channel 
formation  based  on  the  hydrophobicity  of  amino  acid  residues  that  are  found  near  a 
proteolytic cleavage site found in this region (35, 46); the s1 form contains a hydrophobic
region  adjacent  to  the  proteolytic  cleavage  site  that  increases  membrane  insertion  and 
formation  of  membrane  channels  (30,  35).  The  m  region  affects  host  cell  tropism  (28); 
VacA toxins carrying the m1 region are toxic to a broader range of host cells (1, 44).

The i region is positioned between the s and m regions and is the most recent region to be
described. The i1 variants of VacA have been shown to have stronger vacuolating
activity than toxins containing the i2 region (47). Due to the increased anion channel
formation capability, broader cell tropism, and enhanced vacuolating activity, individual
associations between the s1, m1, and i1 types and more severe forms of H. pylori-induced
disease have been identified (5, 47). Furthermore, several studies have linked strains
carrying the s1/m1 allele of the toxin to more severe disease outcomes, since these strains
show the strongest vacuolating activity toward the broadest range of cells (31). However,
the i region has recently been suggested to be a better predictor of disease severity than
either the s or m region, though i appears to co-vary with the s and m regions (47). This
means that the more toxic i1 region is often associated with s1/m1 (47).

Within  the  i  region,  three  specific  clusters  have  been  identified  as  the  main  areas 
to  contain  polymorphisms:  clusters  A,  B,  and  C.  Among  these,  cluster  B  and  C  have  been 
shown  to  impact  the  vacuolating  activity  of  the  toxin  (47).  Because  of  this  link  to  toxin 
activity,  researchers  have  sought  to  determine  the  role  of  natural  individual  amino  acid 
changes  within  these  clusters  in  ultimate  disease  outcome.  For  example,  variation  in  the 
ninth amino acid in cluster B (amino acid 231 of the protoxin) has been linked to disease
development  in  the  Taiwanese  population  (50).  However,  this  amino  acid  appeared  to 
have  no  impact  within  the  South  Korean  population  (27).  Recently,  we  identified  two 
additional  positions  in  the  VacA  intermediate  region  that  contained  polymorphisms: 
position  151  and  196  (27).  While  neither  of  these  amino  acids  showed  a  statistically 
significant  link  to  disease  severity  in  the  small  population  of  samples  that  were  examined, 
the  distribution  of  amino  acids  found  at  position  196  displayed  a  trend  toward 
significance (27). Given this, we sequenced the vacA intermediate region from 231
South Korean isolates and then analyzed the distribution of polymorphisms across the
entire region. Furthermore, we compared these polymorphic vacA distributions to the
various cagA alleles carried by each strain as well as to the ultimate disease development.
Herein, we present expanded i1 and i2 consensus sequences, show that amino acid 196
impacts disease development, and show that amino acid 231 is important for disease
development, but only within strains that carry a non-EPIYA-ABD cagA allele.

Materials and Methods
Bacterial  Strains  and  Culture  Conditions 

This  South  Korean  population  of  260  strains  has  been  previously  described  and 
includes  115  gastritis  isolates,  60  gastric  ulcer  isolates,  55  duodenal  ulcer  isolates,  and  30 
gastric  cancer  isolates  (27,  29).  Isolates  were  preserved  as  stocks  at  -80°C  and  then 
grown  and  expanded  on  antibiotic-supplemented  horse  blood  agar  plates  under 
microaerophilic  conditions  created  by  an  Anoxomat  evacuation/replacement  system 
(Spiral  Biotech,  Norwood,  MA)  exactly  as  previously  described  (9,  29). 

vacA  i  Region  Sequencing 

Chromosomal  DNA  of  each  of  the  260  H.  pylori  strains  was  isolated  using  the 
Easy-DNA  kit  (Invitrogen,  Carlsbad,  CA).  The  vacA  intermediate  region  was  amplified 
and then Sanger dideoxy sequenced using the primers previously described by Rhead et
al.: VacF1 3’-GTTGGGATTGGGGGAATGCCG-5’ and VacR9
3’-TGTTTATCGTGCTGTATGAAGG-5’ (47). Sanger dideoxy sequencing was
performed at both the Uniformed Services University of the Health Sciences Biomedical
Instrumentation Center (Bethesda, MD) and Cosmo Genetech Co., Ltd (Seoul, South
Korea).  Resulting  DNA  sequences  were  analyzed  using  Vector  NTI  version  9.1 
(Invitrogen,  Carlsbad,  CA)  and  Sequencher  4.5  (Gene  Codes  Corp.,  Ann  Arbor,  MI). 

The amino acid numbering system used in this study is based on the VacA sequence of
strain  G27;  numbering  begins  at  the  translational  start  such  that  amino  acid  1  is  the  first 
methionine  of  the  translated  protein. 
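
As an illustration of this numbering convention, the following minimal Python sketch looks up residues by 1-based position and tallies the residues observed at one position across a set of sequences. The function names (residue_at, tally_position) and the toy sequences are hypothetical and are not taken from this study; the sketch assumes ungapped protein sequences that all begin at the translational start.

```python
from collections import Counter


def residue_at(protein: str, position: int) -> str:
    """Return the residue at a 1-based position (position 1 = initiating Met)."""
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside 1..{len(protein)}")
    return protein[position - 1]


def tally_position(proteins, position):
    """Count which residues occur at one 1-based position across sequences."""
    return Counter(residue_at(p, position) for p in proteins)


# Toy sequences (hypothetical, far shorter than VacA).
toy_sequences = ["MKSL", "MKLL", "MKSL"]
print(tally_position(toy_sequences, 3))
```

With real data, the same tally at positions 151, 196, and 231 would give the residue counts that are compared across disease states in the Results.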

Statistical  Analysis 

Fisher’s exact test was used to analyze the association between the vacA
allele, disease state, cagA allele, and specific amino acids within the intermediate region.
Log-linear modeling was used to assess higher-order associations that were significant at
the 5% level. We fit a saturated model using categorical variables representing vacA
genotype, cagA genotype, disease state, gender, and amino acids within the i region, and
then applied a backward selection algorithm, which eliminates the least significant
association at each step and then refits the model to look for associations. Data were analyzed using SPSS
version  16  software  (SPSS  Inc.,  Chicago,  IL)  or  SAS  version  9.1  software  (SAS  Institute 
Inc.,  Cary,  NC). 
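
For readers who wish to see how a two-sided Fisher's exact test operates on a 2x2 table, a minimal Python sketch is given below. It is not the SPSS or SAS procedure used in this study, it handles only 2x2 tables, and the function name fisher_exact_2x2 is an assumption made for illustration. The example counts are the gender counts for tyrosine and phenylalanine at amino acid 151 taken from Table 9.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Counts from Table 9 (amino acid 151): rows are male and female patients,
# columns are tyrosine and phenylalanine.
p_gender = fisher_exact_2x2(51, 61, 40, 70)
print(f"P = {p_gender:.4f}")
```

Statistical packages differ in how they define the two-sided tail, so the value printed by this sketch is not guaranteed to match a published P value digit for digit.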

Nucleotide  Sequence  Accession  Numbers 

The sequences for the i region of vacA from the 60 original strains analyzed (27)
were previously deposited in GenBank under accession numbers GQ338184 to
GQ338243, and the i region sequences of the additional 171 strains have been deposited
in GenBank under accession numbers HM047564 to HM047592 and HM047594 to
HM047735 (see Table S3 in the supplemental material).

Results 

Sample  Acquisition/vacA  i  Region  Sequencing 

The  strains  used  in  this  study  have  been  previously  characterized  for  distribution 
of  both  the  cagA  allele  (27,  29)  and  the  vacA  allele  (27).  However,  the  previous  study 
characterizing  the  vacA  allele  relied  primarily  on  PCR-based  typing  methods  and  only 
analyzed the vacA intermediate region sequence from a subset of strains carrying the i1
allele, whereas the current study represents a detailed analysis of vacA sequences from
the  complete  collection  of  South  Korean  strains.  The  complete  collection  of  260  strains 
contained  254  strains  for  which  we  had  complete  epidemiological  data.  These  254  strains 
were obtained from patients with a mean age of 51 years and an age range of 14 to 86
years.  These  strains  were  evenly  distributed  by  gender;  126  strains  were  from  female 
patients  with  a  mean  age  of  53  years  and  an  age  range  of  21  to  86  years,  and  128  strains 
were  from  male  patients  with  a  range  of  14  to  82  years  and  a  mean  age  of  50  years.  These 
strains were distributed across various H. pylori-induced disease states: 45% from
patients  diagnosed  with  gastritis,  22%  with  gastric  ulcers,  21%  with  duodenal  ulcers,  and 
12%  with  gastric  cancer  (29). 

Of the 260 strains in this entire collection, 231 strains were successfully
sequenced  for  the  vacA  i  region  and  222  of  these  strains  contained  complete 
epidemiological  data  as  well  as  CagA  and  VacA  genotypes  (Fig.  11,  Table  8,  and  Table 
S3 in the supplemental material). These sequenced regions were from strains from
patients with an age range of 14 to 86 years and a mean age of 50.3 years. Of the 222
isolates, 112 were from male patients aged 14 to 82 years with a mean age of 48.9 years,
and 110 were from female patients aged 21 to 85 years with a mean age of 51.7 years.
The distribution of the sequenced strains across disease states was similar to that of the
overall population; 45.5% were from gastritis patients, 41% were from patients suffering from
ulcers (22.1% duodenal ulcers and 18.9% gastric ulcers), and 13.5% were from cancer
patients.


Table 8: Distribution of cagA and vacA alleles across the different disease states.

                        All       Gastritis   Duodenal Ulcers   Gastric Ulcers   Gastric Cancer
Overall
Total                   222       101         49                42               30
  Age Range             14-86     19-82       14-72             34-84            37-86
  Mean Age              50.3      48.3        45.1              55.1             56.6
  Male                  112       34          30                32               16
    Age Range           14-82     19-76       14-70             34-82            38-70
    Mean Age            48.9      46.6        41.3              54.5             55.1
  Female                110       67          19                10               14
    Age Range           21-86     21-82       31-72             46-84            37-86
    Mean Age            51.7      49.2        51.1              57.1             61
CagA
EPIYA-ABD               189       81          40                38               30
  Age Range             14-86     19-78       14-72             34-84            37-86
[...]
    Age Range           38-72     38-68       61-72             56               N/A
    Mean Age            58.2      53.3        66.5              N/A              N/A
s1/i3/m1                3         1           1                 0                1
  Age Range             32-86     32          47                N/A              86
  Mean Age              55        N/A         N/A               N/A              N/A
  Male                  2         1           1                 0                0
    Age Range           32-47     32          47                N/A              N/A
    Mean Age            39.5      N/A         N/A               N/A              N/A
  Female                1         0           0                 0                1
    Age Range           86        N/A         N/A               N/A              86
    Mean Age            N/A       N/A         N/A               N/A              N/A

* indicates any other genotype besides EPIYA-ABD, including Western strains and EPIYA-AABD, -BD, -BBD,
-ABAB**D, as well as -AB**D, where a mutation within the EPIYA-B motif is designated by **

N/A stands for Not Applicable

Figure 11: WebLogo showing the VacA intermediate region’s major polymorphic
domains within the South Korean population. The WebLogo was created by inputting
the amino acid sequences from each of the three different intermediate alleles (221 i1
sequences, 8 i2 sequences, and 3 i3 sequences) into http://weblogo.threeplusone.com/ (15,
49). The three primary regions of polymorphism (clusters A, B, and C) as well as the two
amino acids shown to impact disease development (196 and 231) are indicated. The logo
represents the alignment at each position by a stack of letters, where the height of each
letter is proportional to the observed frequency of the corresponding amino acid and the
overall height of each stack is proportional to the sequence conservation, measured in
bits, at that position (15).
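
The relationship between stack height and sequence conservation described in the legend of Figure 11 can be sketched in a few lines of Python. This is a simplified illustration that assumes a 20-letter amino acid alphabet and omits any small-sample correction that logo software may apply; the function names are hypothetical. The example column uses the residue counts reported in this chapter for position 196 (148 serine and 74 leucine).

```python
from collections import Counter
from math import log2


def column_information(column: str, alphabet_size: int = 20) -> float:
    """Conservation of one alignment column in bits: log2(N) minus Shannon entropy."""
    total = len(column)
    entropy = -sum((n / total) * log2(n / total)
                   for n in Counter(column).values())
    return log2(alphabet_size) - entropy


def letter_heights(column: str, alphabet_size: int = 20) -> dict:
    """Height of each residue in the stack: its frequency times the column information."""
    info = column_information(column, alphabet_size)
    total = len(column)
    return {aa: info * n / total for aa, n in Counter(column).items()}


# One alignment column built from the position 196 counts given in the text.
column_196 = "S" * 148 + "L" * 74
print(round(column_information(column_196), 3))
print(letter_heights(column_196))
```

A fully conserved column reaches the maximum of log2(20) bits, whereas a column split between two residues loses up to one bit, which is why polymorphic positions appear as shorter stacks.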

vacA i Region: i1, i2, and i3

The fact that the vacA i region shows polymorphism that may affect toxin activity
was only recently described. At that time, consensus sequences were identified for the i1
and i2 regions based on sequences from strains 60190 and Tx30a, respectively (Fig. 11
and Fig. 12; 47). Subsequently, the consensus sequences for the i1 and i2 regions were
verified by analysis of 123 strains from four distinct populations by Chung et al., who
also described an i3 region that appears to be a hybrid of the i1 and i2 sequences (11).

Based on this new (i1, i2, and i3) nomenclature, our South Korean population
contained four different vacA alleles: s1/i1/m1 (200 isolates), s1/i1/m2 (11 isolates),
s1/i2/m2 (eight isolates), and s1/i3/m1 (three isolates; Table 8). All three of the isolates
defined as having an i3 region contained an i2 cluster B consensus sequence and an i1
cluster C consensus sequence (Fig. 11 and Fig. 12). Of the s1/i1/m1 strains, 92 were
from gastritis patients, 41 were from duodenal ulcer patients, 40 were from gastric ulcer
patients, and 27 were from gastric cancer patients. Four of the s1/i1/m2 strains were
from gastritis patients, four were from duodenal ulcer patients, one was from a gastric
ulcer patient, and two were from cancer patients. Of the eight s1/i2/m2 strains, four were
from gastritis patients, three were from duodenal ulcer patients, and one was from a gastric
ulcer patient. The three s1/i3/m1 strains were from one gastritis patient, one duodenal
ulcer patient, and one cancer patient (Table 8). There was no statistical association
between the distribution of the vacA alleles and the different disease states (P=0.6865).

Given the fact that we were able to determine the i region sequence from 231
strains, we next examined the i sequences to identify amino acids that predominated in
East Asian strains, as well as to determine the amino acid differences between the i1 and
i2 vacA alleles that might differ from the defined consensus sequence (47). For this
analysis, we defined the consensus sequence in our population as the amino acids
encoded by at least 85% of all strains. Comparison of our consensus sequence to that
defined by Rhead et al. revealed the following: in cluster A we found 1) substitution of a
phenylalanine instead of a tyrosine at the first amino acid for the i1 allele, 2) reversal of
the twelfth and fourteenth amino acids of the i2 allele (asparagine, phenylalanine, aspartic
acid versus aspartic acid, phenylalanine, asparagine in our population), and 3) that the
main differences between the i1 and i2 vacA alleles in our population were the fourth and
sixth amino acids (i1: SAD and i2: GAN; Fig. 12; 47). In cluster B we found 1) our
population displayed variability at the ninth amino acid of the i1 allele as compared to the
previously described serine, 2) the first two amino acids of the i2 alleles in our population
were similar to the i1 sequence and contained a glutamine followed by alanine rather than
the lysine-serine combination found in other populations, and 3) the major differences in
cluster B between i1 and i2 alleles in our population were the fifth, ninth, and tenth amino
acids (Fig. 12; 11, 47). In cluster C we found 1) our population had substituted a
methionine for an asparagine at the eighth amino acid of the i1 allele, 2) the replacement
of a histidine for a glutamine at the sixth amino acid of the i2 allele, and 3) the major
differences in cluster C between the i1 and i2 alleles were the insertion of three additional
amino acids in the i2 allele (an asparagine, histidine, and serine) after the fourth amino
acid and then subsequent differences in the first, third, sixth, eighth, and tenth amino
acids of the i1 allele (correlating with the first, third, ninth, eleventh, and thirteenth amino
acids of the i2 allele; Fig. 12; 11, 47).

Figure 12: Consensus sequences for the vacA intermediate region. The consensus
sequences of the i1 and i2 regions were previously defined by Rhead et al. and Chung et
al. (11, 47) and are shown in comparison to sequences obtained in this study. Dark gray
shading indicates differences between the different i1 and i2 consensus sequences or
differences among the i3 sequences. Light gray shading indicates amino acids that are
different between i1 and i2 strains. Positions where there is no consensus for a particular
amino acid are designated by a period. Dashes represent the points of insertion of
additional amino acids that are only found in cluster C of the i2 allele.
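
The 85% rule used to define the consensus sequence can be sketched as follows. This is a minimal illustration in Python, not the procedure actually run in Vector NTI or Sequencher; the function name consensus and the toy three-residue sequences are assumptions, and the sketch expects sequences that have already been aligned to equal length. Positions without a consensus residue are written as a period, following the convention of Figure 12.

```python
from collections import Counter


def consensus(aligned, threshold=0.85):
    """Consensus of equal-length aligned sequences; '.' where no residue reaches the threshold."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*aligned):
        # Most frequent residue in this column and how often it occurs.
        residue, count = Counter(column).most_common(1)[0]
        out.append(residue if count / len(aligned) >= threshold else ".")
    return "".join(out)


# Toy alignment (hypothetical): nine sequences with SAD and one with GAN.
print(consensus(["SAD"] * 9 + ["GAN"]))
```

Lowering the share of the majority residue below the threshold, for example eight SAD sequences and two GAN sequences, leaves only the invariant middle position in the consensus.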

The newest allele of the intermediate region, the i3 region, has been defined as one
that contains a cluster B from either i1 or i2 and a cluster C from the other allele (11).
This South Korean population contained three i3 strains. All of these strains contained a
cluster B from an i2 allele and a cluster C from an i1 allele. The differences in cluster A
in this population between i1 and i2 are the fourth and sixth amino acids. Of note,
the three i3 sequences showed no consensus at these amino acids: one was
identical to the i1, another was identical to the i2, and one contained a different combination
altogether (Fig. 12).

Amino  Acid  151 

In-depth sequence analysis of the entire i region revealed that, as previously
described (27), amino acid 151 showed amino acid polymorphism. Strains contained
either a tyrosine (91 isolates, 41%) or phenylalanine (131 isolates, 59%) at this position.
The  distribution  of  amino  acids  at  this  position  was  not  associated  with  gender 
(P=0.1749); the distribution between males and females was fairly even (Table 9). In
agreement with the smaller subset of isolates we previously analyzed (27), there was no
association between variation at this residue and the cagA allele (P=0.4433). Variation was
also not associated with the overall vacA allele (P=0.2177), nor with either the i
(P=0.1692) or m (P=0.8097) sub-regions of the vacA allele. Polymorphisms at this
amino acid also did not impact disease state, even when individual disease states were
compared directly to each other (Table 10).


Table 9: Distribution of amino acids at polymorphic positions within the intermediate region.

                        All       Gastritis   Duodenal Ulcers   Gastric Ulcers   Gastric Cancer
Amino Acid 151
Tyrosine                91        41          15                20               15
  Age Range             14-84     25-74       14-70             34-84            38-78
  Mean Age              49.8      46.7        45.5              54.3             56.4
  Male                  51        17          10                17               7
    Age Range           32-81     28-74       14-70             34-81            38-64
    Mean Age            45.9      45.1        42.4              52.7             51.1
  Female                40        24          5                 3                8
    Age Range           37-84     25-74       41-57             48-84            41-78
    Mean Age            49        47.9        51.6              63               61
Phenylalanine           131       60          34                22               15
  Age Range             19-86     19-82       20-72             41-82            37-86
  Mean Age              50.7      49.9        44.9              55.9             59.3
  Male                  61        17          20                15               9
    Age Range           19-82     19-78       20-70             41-80            44-70
    Mean Age            49.8      50          40.7              56.5             58.2
  Female                70        43          14                7                6
    Age Range           21-86     21-82       31-72             46-63            37-86
    Mean Age            51.5      49.9        50.9              54.6             61
Amino Acid 231
Glycine                 153       67          34                30               22
  Age Range             14-84     19-82       14-70             34-82            37-78
  Mean Age              50.3      50.2        42.6              54.6             57
  Male                  83        [...]       [...]             [...]            [...]
    Age Range           14-84     19-78       14-70             34-82            38-68
    Mean Age            48.6      49.3        40.7              53               54.5
  Female                70        42          11                6                11
    Age Range           25-84     25-82       31-61             48-84            37-78
    Mean Age            52.3      50.7        46.8              60.7             59.5
Other AA*               69        34          15                12               8
  Age Range             21-86     27-74       31-72             41-81            44-86
  Mean Age              50.4      45.5        50.5              57.3             60.3
  Male                  29        9           7                 8                5
    Age Range           29-81     29-61       31-61             41-81            44-70
    Mean Age            50        42.6        43.3              60.1             47.6
  Female                40        25          [...]             [...]            [...]
    Age Range           21-86     21-74       [...]             [...]            [...]
    Mean Age            50.7      46.6        [...]             [...]            [...]
Amino Acid 196
Serine                  148       64          [...]             [...]            [...]
  Age Range             19-84     19-82       [...]             [...]            [...]
  Mean Age              50.9      48.8        [...]             [...]            [...]
  Male                  76        22          [...]             [...]            [...]
    Age Range           19-82     19-76       [...]             [...]            [...]
    Mean Age            49.6      46.6        [...]             [...]            [...]
  Female                72        42          10                [...]            [...]
    Age Range           21-84     21-82       31-61             [...]            [...]
    Mean Age            52.3      48.4        48.7              [...]            [...]
Leucine                 74        37          21                [...]            [...]
  Age Range             14-86     24-78       14-72             [...]            [...]
  Mean Age              49.1      48.4        47                [...]            [...]
  Male                  36        12          12                [...]            [...]
    Age Range           14-78     28-78       14-70             [...]            [...]
    Mean Age            48.8      49.2        42.1              [...]            [...]
  Female                38        25          9                 [...]            [...]
    Age Range           24-86     24-74       39-72             [...]            [...]
    Mean Age            50.7      48          53.7              [...]            [...]

* Other AA indicates strains containing any amino acid other than a glycine at this position

N/A stands for Not Applicable

Amino  Acid  231 

Amino  acid  231,  which  is  the  ninth  amino  acid  in  cluster  B,  was  previously  shown 
to  contain  amino  acid  polymorphisms  that  are  important  for  disease  development  in  a 
Taiwanese population (50). In our population, we identified five different amino acids at
this  position:  glycine,  serine,  aspartic  acid,  asparagine,  and  arginine.  To  determine  if 
distribution  of  residues  at  this  position  was  important,  statistical  associations  were 
analyzed  across  the  distribution  of  all  the  different  amino  acids  as  well  as  glycine  versus 
all  other  amino  acids  combined;  a  previous  study  identified  the  presence  of  the  glycine 
residue  as  important  for  the  progression  to  more  severe  disease  (50).  Since  we  identified 
no  difference  between  which  associations  were  statistically  significant  and  since  the 
previous  literature  assessed  the  glycine  residue  versus  any  other  amino  acid  (50),  the 
numbers  we  present  compare  the  distribution  of  glycine  versus  all  other  amino  acids 
found at position 231. The distribution of amino acids at position 231 was not associated
with gender (P=0.1109; Table 9). The strong association we previously identified using a
subset  of  South  Korean  isolates  (27)  between  the  distribution  of  amino  acid 
polymorphisms at this position and the distribution of the cagA allele was maintained in
this larger population (P=0.0081). More specifically, among EPIYA-ABD strains,
glycine was more prevalent at this position (72.5%) than among strains carrying a
non-EPIYA-ABD cagA allele (48.5%). Polymorphisms at this site were also associated with
the  overall  vacA  allele  (P<0.0001),  every  sub-region  of  the  vacA  allele  (i  region  P<0.0001 
or  m  region  P=0.0168),  and  every  combination  of  sub-regions  (s  and  i  P<0.0001,  m  and  i 
P<0.0001  and  s  and  m  P=0.0168).  However,  even  though  glycine  at  this  position  was 
previously  identified  as  important  for  the  progression  to  more  severe  disease  (50),  we 
found that polymorphisms at this position did not impact disease state. Additionally, there
was no significance when comparing across any individual disease, regardless of whether
individual disease states were compared alone or a combination of disease states was
analyzed (Table 10). These data suggest that this amino acid is not important for disease
progression or that other factors mask the contribution of this amino acid to disease state in
the  South  Korean  population. 

Amino  Acid  196 

As  we  previously  described  in  a  subset  of  South  Korean  strains  (27),  amino  acid 
196  was  identified  as  a  position  that  contained  amino  acid  polymorphisms.  In  this  larger 
population,  either  a  serine  (148  isolates,  67%)  or  a  leucine  (74  isolates,  33%)  was  found 
at  this  position.  The  distribution  of  amino  acids  at  this  position  was  not  associated  with 
gender  (P=0.7762)  and  was  fairly  evenly  distributed  between  males  and  females  (Table 
9).  Additionally,  there  was  still  no  association  with  the  cagA  allele  (P=0.6928). 

However,  polymorphisms  at  this  residue  were  associated  with  the  vacA  allele  (P=0.0003). 
This association was present for the i region of the vacA allele (P=0.0001) and for any
combination that contained the i region (s and i P=0.0001 and m and i regions P=0.0003).
However, there was no association when the m region was assessed alone (P=0.7600) or
when the combination of regions did not include the i region (s and m P=0.0760). In fact,
all of the i2 strains contain a leucine at this position (Fig. 11).

Polymorphisms  at  this  amino  acid  did  not  impact  disease  state  (P=0.0624)  as  a 
whole  (Table  10).  Additionally,  this  position  was  not  significant  when  gastritis  was 
compared  to  any  non-cancer  disease  states:  gastritis  versus  all  other  disease  states 
(duodenal  ulcers,  gastric  ulcers,  and  gastric  cancers;  P=0.3916),  gastritis  versus  peptic 
ulcers  (both  duodenal  and  gastric;  P=0.8808),  gastritis  versus  duodenal  ulcers 
(P=0.4796),  or  gastritis  versus  gastric  ulcers  (P=0.2500).  It  was  also  not  significant  when 
peptic  ulcers  were  compared  to  non-cancer  disease  states:  duodenal  ulcers  versus  all 
other  disease  states  (P=0.1237),  duodenal  ulcers  versus  gastric  ulcers  (P=0.1250),  or 
gastric  ulcers  versus  all  other  disease  states  (P=0.3636).  There  was  also  no  association 
between the amino acid at position 196 and gastric cancer versus peptic ulcers (P=0.0689).
However,  we  did  find  an  association  between  the  amino  acid  at  this  position  and  the 
development  of  gastric  cancer  when  compared  to  development  of  all  other  disease  states 
(P=0.0389),  versus  duodenal  ulcers  alone  (P=0.0254),  and  versus  gastritis  alone 
(P=0.0460).  Additionally,  there  was  a  statistical  association  between  polymorphisms  at 
position  196  and  more  severe  disease  manifestations  (gastric  ulcers  and  gastric  cancer) 
versus  less  severe  disease  manifestations  (gastritis  and  duodenal  ulcers;  P=0.0155;  Table 
10).  While  the  presence  of  a  serine  at  this  position  was  more  prevalent  across  all  patients, 
patients  suffering  from  gastric  cancer  were  five  times  more  likely  than  other  patients  to 
carry a serine at this location. These data suggest that position 196 does impact disease
development, but that the overall impact is masked by the progression of this disease
through gastric ulcers.

Table 10: P values of the distribution of amino acids at polymorphic positions across various disease states.

                                                            Distribution of amino acids at position
Comparison                                                  151       231       196
Across all disease states                                   0.2639    0.8914    0.0624
Gastritis versus all other disease states                   1.0000    0.4694    0.3916
  (duodenal ulcers, gastric ulcers, and gastric cancers)
Gastritis versus peptic ulcers (both duodenal and gastric)  0.7700    0.6418    0.8808
Gastritis versus duodenal ulcers                            0.2818    0.8529    0.4796
Gastritis versus gastric ulcers                             0.4627    0.6946    0.2500
Gastritis versus gastric cancers                            0.4044    0.5132    0.0460*
Duodenal ulcers versus all other disease states             0.1025    1.0000    0.1237
Duodenal ulcers versus gastric ulcers                       0.1306    1.0000    0.1250
Duodenal ulcers versus gastric cancer                       0.0994    0.8015    0.0254*
Gastric ulcers versus all other disease states              0.3847    0.8533    0.3636
Gastric ulcers versus gastric cancer                        1.0000    1.0000    0.3994
Gastric cancer versus all other gastric diseases            0.3207    0.6732    0.0389*
Gastric cancer versus peptic ulcers                         0.2910    0.8201    0.0689

Statistically significant associations (bold script with the P value shaded in gray in the original table) are marked with an asterisk.
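
The pairwise comparisons above each reduce to a contingency table of disease group against residue. This section does not restate which test produced the P values, so the two-sided Fisher's exact test below is an assumption. It is a minimal sketch only: the function names and the counts are invented for illustration and are not data from this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every possible table that is no more probable than the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def odds_ratio(a, b, c, d):
    """Sample odds ratio for [[a, b], [c, d]]; undefined if b or c is zero."""
    return (a * d) / (b * c)


# HYPOTHETICAL counts, not taken from the thesis:
# rows = gastric cancer / all other disease states,
# columns = serine at position 196 / any other residue.
cancer_ser, cancer_other = 18, 4
rest_ser, rest_other = 95, 80

print("P =", fisher_exact_two_sided(cancer_ser, cancer_other, rest_ser, rest_other))
print("odds ratio =", odds_ratio(cancer_ser, cancer_other, rest_ser, rest_other))
```

The odds ratio is included because a statement such as "five times more likely" is an effect size, which a P value alone does not convey.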

Higher  Order  Associations 

Log-linear modeling using a combination of the available data revealed two direct
three-way associations: one among disease state, cagA allele, and variation at amino acid
231 (P=0.012), and one among disease state and variation at amino acids 196 and 231
(P=0.029). As mentioned above, we found no direct association between variation at amino
acid 231 and disease state in our population (P=0.8914). However, we noted that both
three-way associations involved disease state and amino acid 231, and that one of them
also involved the cagA allele. We therefore asked whether the presence of a Western or
East Asian cagA allele affected the association of variation at residue 231 with disease
progression. We found that a two-way association did exist between disease state and
amino acid 231, but only within the non-EPIYA-ABD population (P=0.0367). This again
suggests that the effect of different virulence factors, or of polymorphisms within these
virulence factors, may be masked by which cagA allele is present.
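
A three-way association in a log-linear model means that the association between two factors changes across levels of the third. The sketch below illustrates the idea on a simplified 2x2x2 table. It fits the model containing all two-way terms by iterative proportional fitting and measures what remains with the likelihood-ratio statistic G². The binary coding, the function names, and the counts are assumptions for illustration; the study's variables have more levels, and its exact model specification is not given in this section.

```python
from itertools import product
from math import erfc, log, sqrt

# Index triples (disease, residue, cagA), each factor coded 0 or 1 (an assumption).
CELLS = list(product(range(2), repeat=3))


def fit_no_three_way(observed, cycles=200):
    """Fit the all-two-way-terms model by iterative proportional fitting.

    Assumes every two-way margin of `observed` is positive.
    """
    fitted = {cell: 1.0 for cell in CELLS}
    for _ in range(cycles):
        for dropped in range(3):  # sum over this axis to get a two-way margin
            kept = [axis for axis in range(3) if axis != dropped]
            obs_margin, fit_margin = {}, {}
            for cell in CELLS:
                key = tuple(cell[axis] for axis in kept)
                obs_margin[key] = obs_margin.get(key, 0.0) + observed[cell]
                fit_margin[key] = fit_margin.get(key, 0.0) + fitted[cell]
            for cell in CELLS:
                key = tuple(cell[axis] for axis in kept)
                fitted[cell] *= obs_margin[key] / fit_margin[key]
    return fitted


def three_way_test(observed):
    """Return (G2, P) for the three-way term of a 2x2x2 table (1 degree of freedom)."""
    fitted = fit_no_three_way(observed)
    g2 = 2.0 * sum(n * log(n / fitted[cell]) for cell, n in observed.items() if n > 0)
    g2 = max(g2, 0.0)  # guard against tiny negative rounding error
    return g2, erfc(sqrt(g2 / 2.0))  # chi-square upper tail for 1 df


# HYPOTHETICAL counts, not taken from the thesis: the disease/residue
# association runs in opposite directions in the two cagA strata.
example = {
    (0, 0, 0): 20, (0, 1, 0): 5, (1, 0, 0): 5, (1, 1, 0): 20,
    (0, 0, 1): 5, (0, 1, 1): 20, (1, 0, 1): 20, (1, 1, 1): 5,
}
g2, p = three_way_test(example)
print(f"G2 = {g2:.2f}, P = {p:.3g}")
```

In the invented table the two strata cancel when pooled, so the pooled two-way table shows no association even though each stratum does. That is the kind of masking described in the paragraph above.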


Discussion 

Polymorphisms within vacA have been studied for several years, but these studies have
primarily focused on the s and m regions. However, the most recently identified
polymorphic region of vacA, the i region, has been suggested to be a determinant of
vacuolating activity, as well as the best indicator of disease pathology, at least in
Western strains (2, 16, 47). Within this region, three clusters of polymorphisms have been
reported: clusters A, B, and C (47). In fact, two different amino acid substitutions
within the i1 clusters have been identified as potential markers for Taiwanese VacA (50).
However, as previously suggested by our group (27), both of these amino acid substitutions
are conserved within the South Korean population: we found a phenylalanine instead of a
tyrosine for the first amino acid in cluster A (94.4%) and a methionine for the second
asparagine within cluster C (99.5%). This suggests that these amino acid substitutions are
not limited to the Taiwanese population, but could serve as a general marker for East
Asian VacA.

Further analysis of the amino acid consensus sequences of clusters A, B, and C revealed
several differences from the consensus sequences previously reported by Rhead et al. (47)
and Chung et al. (11). This South Korean population provides the largest number of VacA i
region sequences analyzed to date, allowing us to better identify amino acids that differ
between the i1 and i2 alleles (Fig. 12). Cluster A was well conserved across strains, and
the i1 and i2 consensus sequences were very similar to one another. Interestingly, the
three i3 strains that we identified differed from one another in cluster A: one strain
contained the exact i1 consensus sequence, one contained the exact i2 consensus sequence,
and one was completely different (Fig. 12), indicating that these strains may be in the
process of evolving from one allele into the other.

Within cluster B, the fifth, ninth, and tenth amino acids represent the main differences
between the i1 and i2 alleles. The ninth amino acid (residue 231) was previously shown to
have a role in disease (50). We found that this residue was not variable in the small
subset of i2 strains within our population, though it was variable within the i1 strains.
Moreover, this variability at amino acid 231 does impact disease development, but only
within strains containing a non-EPIYA-ABD CagA. This finding suggests that within strains
carrying the most virulent cagA allele, this residue is less important.

The differences between the i1 and i2 alleles were most pronounced in cluster C. This is
primarily due to the addition of three polar amino acids (asparagine, histidine, and
serine), but an additional five amino acids differ between the i1 and i2 alleles (Fig.
12). Since both clusters B and C have been suggested to affect toxin activity, studies
directed towards examination of the specific role of these amino acids in vacuolating
activity would be of interest. Creation and examination of isogenic toxin derivatives
varying only in the residues of interest would aid in this effort. Furthermore, given that
the i3 strains appear to be hybrids of the i1 and i2 alleles at clusters B and C, it would
be of significant interest to determine if there is a functional difference in the
vacuolating activity of i1, i2, and i3 toxins. To further investigate the role of these
clusters, toxin activity could be assessed among our identified i3 strains, as well as
those that contain an i1 cluster B and an i2 cluster C. Finally, even though cluster A has
not been suggested to be important for activity, it would also be interesting to assess
the sequences from cluster A to analyze the variance of amino acids four and six,
potentially providing insight into VacA evolution.

Interestingly, five different amino acids were found at position 231 (the ninth amino acid
of cluster B): glycine, serine, arginine, asparagine, and aspartic acid. Of these, glycine
is the only nonpolar amino acid, suggesting that perhaps this residue is important for the
conformation or folding of VacA. Indeed, this could explain the statistical association
between the distribution of amino acids at this position and every region of VacA.
Previous work showed that a glycine at amino acid 231 was linked to disease development
within the Taiwanese population (50). However, among our South Korean population, the
distribution of amino acids found at this position had no overall impact on disease
development. Conversely, there was a strong association between variation at this residue
and the cagA allele, as well as two direct three-way associations between disease state,
the distribution of amino acids at this position, and either cagA allele or amino acid
196, both of which affect the development of cancer in this population (Table 10; 27, 29).
Given the association with cagA, the impact of this amino acid across East Asian and
Western cagA alleles was examined, and a direct two-way association between amino acid 231
and disease state was found among the strains that carry non-EPIYA-ABD cagA alleles. This
emphasizes the importance of the i region when carried within the context of Western
strains, and suggests that there may be other factors that are more important in disease
development or that mask the importance of the i region in severe disease development
among East Asian strains. Indeed, even among Western strains, the cagA allele was the most
important virulence factor for development of gastric cancer, whereas the VacA i region
was the best indicator for development of peptic ulcer disease (5). Further exploration of
differences between Western and East Asian strains may help to explain the exact mechanism
of interaction of CagA and VacA.

Variation at amino acid 196 was statistically linked only to the vacA i region, indicating
that it may be a true indicator of i region-associated impacts. While overall this amino
acid was not linked to disease state, and there was no association between the less severe
disease states and the distribution of amino acids at position 196, there was a
statistical difference in the distribution of amino acids at this position when gastric
cancer was compared against all other disease states, gastritis, or duodenal ulcers.
However, there was no statistical difference in the distribution of amino acids at this
position between gastric cancer and gastric ulcers. This may be due to the fact that
gastric ulcers can be a precursor to gastric cancer (19, 36). Given this, more severe
disease states (gastric cancer and gastric ulcers) were next compared to less severe
disease manifestations (duodenal ulcers and gastritis), and a statistical association was
identified. This suggests that variation at residue 196 is important for progression to
severe disease, and that the overall significance of the amino acids found at this
position is probably masked by the lack of a difference between gastric cancer and gastric
ulcers.

While a serine at this position was more prevalent across the overall population, patients
suffering from gastric cancer were five times more likely than other patients to carry a
serine at this location. However, all of the i2 alleles carried a leucine at this
position, and if this trend holds true for a larger population of i2 strains, then its
contribution to disease development may become more evident. Since the distribution of
amino acids at this position was linked not only to the overall vacA allele, but
specifically to the i region, the different prevalence of serine found at this position
may explain why a previous study concluded that the i region was the best predictor of
disease (47).

Since both CagA and VacA polymorphisms influence disease development, perhaps it is not
surprising that within a population of isolates from a country with one of the highest
rates of H. pylori colonization and gastric cancer (21, 51, 55), the majority of strains
encode the most toxic forms of both CagA and VacA. While polymorphisms important for
disease severity have been identified within both CagA and VacA individually, strains that
are CagA positive and carry the VacA s1/m1 allele have been shown to induce highly active
corpus gastritis, which has been associated with the progression to gastric cancer
(37-39). Additionally, we recently presented evidence that within this high-risk
population for gastric cancer development, there is a significant interaction between the
vacA allele, cagA allele, and disease state (27). However, the reason only a small
percentage of the population develops cancer is still unclear. Though it is evident that
host, environmental, and bacterial factors play a role in H. pylori-induced disease
(reviewed in 8, 57), additional studies are required to determine the contribution of all
of these factors, both individually and in conjunction with each other,
to the development of H. pylori-induced gastric cancer.

Table S3: Complete Korean Collection

Strain   Disease             Sex  Age  CagA EPIYA cagA accession  vacA allele  vacA accession
                                       motif      number                       number
K1-CA    Cancer              F    68   ABD        FJ458117        s1/i1/m2     GQ338184
K2-CA    Cancer              F    64   ABD                        s1/i1/m1     GQ338205
K3-CA    Cancer              F    65   ABD        FJ458118        s1/i1/m1     GQ338222
K4-CA    Cancer              F    37   ABD                        s1/i1/m1     GQ338225
K5-CA    Cancer              M    70   ABD                        s1/i1/m1     GQ338227
K6-CA    Cancer              F    45   ABD                        s1/i1/m1     GQ338233
K7-CA    Cancer              M    56   ABD                        s1/i1/m1     GQ338235
K8-CA    Cancer              M    56   ABD                        s1/i1/m1     GQ338239
K9-CA    Cancer              M    58   ABD        FJ458119        s1/i1/m1     GQ338242
K10-CA   Cancer              M    52   ABD                        s1/i1/m1     HM047617
K11-CA   Cancer              M    68   ABD                        s1/i1/m1     HM047618
K12-G    Gastritis           F    52   ABD                        s1/i1/m1     HM047619
K13-CA   Cancer              M    38   ABD                        s1/i1/m1     HM047620
K14-CA   Cancer              F    78   ABD                        s1/i1/m1     HM047621
K15-CA   Cancer              F    66   ABD        FJ458120        s1/i1/m1     HM047622
K16-CA   Cancer              M    48   ABD                        s1/i1/m1     GQ338194
K17-CA   Cancer              F    56   ABD                        s1/i1/m1     GQ338196
K18-CA   Cancer              M    64   ABD                        s1/i1/m1     HM047623
K19-CA   Cancer              F    86   ABD                        s1/i3/m1     HM047624
K20-CA   Cancer              M    48   ABD                        s1/i1/m1     HM047625
K21-CA   Cancer              M    44   ABD                        s1/i1/m1     GQ338209
K22-GU   Gastric Ulcer       M    42   ABD                        s1/i1/m1     HM047626
K23-DU   Duodenal Ulcer      M    47   ABD                        s1/i3/m1     GQ338210
K24-DU   Duodenal Ulcer      M    38   ABD        FJ458121        s1/i1/m1     HM047627
K25-DU   Duodenal Ulcer      M    44   ABD                        s1/i1/m1     GQ338214
K26-DU   Duodenal Ulcer      M    20   ABD                        s1/i1/m1     GQ338218
K27-DU   Duodenal Ulcer      M    47   ABD                        s1/i1/m1     HM047628
K28-DU   Duodenal Ulcer      M    28   ABD                        s1/i1/m1     HM047629
K29-DU   Duodenal Ulcer      F    57   ABD                        s1/i1/m1     HM047630
K30-DU   Duodenal Ulcer      F    61   ABCCC      FJ458122        s1/i2/m2     HM047631
K31-DU   Duodenal Ulcer      M    33   ABD                        s1/i1/m1     HM047632
K32-DU   Duodenal Ulcer      F    41   ABD                        s1/i1/m1     HM047633
K33-DU   Duodenal Ulcer      F    31   ABC        FJ458123        s1/i1/m1     HM047634
K34-DU   Duodenal Ulcer      M    43   ABD        FJ458124        s1/i1/m1     GQ338223
K35-DU   Duodenal Ulcer      F    56   ABD                        s1/i1/m2     HM047635
K36-DU   Duodenal Ulcer      M    46   ABD                        s1/i1/m1     HM047636
K37-DU   Duodenal Ulcer      M    61   ABD                        s1/i2/m2     HM047637
K38-DU   Duodenal Ulcer      F    39   ABC                        s1/i1/m1     GQ338224
K39-DU   Duodenal Ulcer      M    59   ABD                        s1/i1/m1     HM047638
K40-DU   Duodenal Ulcer      M    53   ABD                        s1/i1/m1     HM047639
K41-DU   Duodenal Ulcer      M    55   ABD                        s1/i1/m1     HM047640
K42-DU   Duodenal Ulcer      F    48   ABD                        s1/i1/m2     HM047641
K43-DU   Duodenal Ulcer      M    70   ABD                        s1/i1/m1     HM047642
K44-DU   Duodenal Ulcer      M    42   ABD                        s1/i1/m1     HM047643
K45-DU   Duodenal Ulcer      M    22   ABD        FJ458125        s1/i1/m1     HM047644
K46-DU   Duodenal Ulcer      F    61   ABD                        s1/i1/m1     HM047645
K47-DU   Duodenal Ulcer      F    72   ABD                        s1/i2/m2     HM047646
K48-DU   Duodenal Ulcer      F    41   ABD                        s1/i1/m1     HM047647
K49-DU   Duodenal Ulcer      M    33   ABC                        s1/i1/m1     GQ338226
K50-DU   Duodenal Ulcer      M    35   ABD                        s1/i1/m1     HM047648
K51-GU   Gastric Ulcer       M    54   ABD                        s1/i1/m1     GQ338228
K52-GU   Gastric Ulcer       M    46   ABD                        s1/i1/m1     GQ338229
K53-G    Gastritis           M    60   ABD                        s1/i1/m1     HM047649
K54-G    Gastritis           F    58   ABD        FJ458126        s1/i1/m1     HM047650
K55-GU   Gastric Ulcer       F    57   ABD                        s1/i1/m1     GQ338230
K56-G    Gastritis           F    48   ABD                        s1/i1/m1     HM047651
K57-G    Gastritis           F    63   ABD                        s1/i1/m1     GQ338231
K58-G    Gastritis           F    61   ABD                        s1/i1/m1     GQ338232
K59-G    Gastritis           M    48   BBD        FJ458127        s1/i1/m1     HM047652
K60-G    Gastritis           M    53   ABC                        s1/i1/m1     HM047653
K61-GU   Gastric Ulcer       F    57   *                          *
K62-GU   Gastric Ulcer       F    65   *                          *
K63-GU   Gastric Ulcer       M    59   *                          *
K64-G    Gastritis           F    61   ABCC       FJ458128        s1/i1/m1     HM047654
K65-G    Gastritis           M    49   ABD                        s1/i1/m1     HM047655
K66-G    Gastritis           M    43   ABC                        s1/i2/m2     HM047656
K67-DU   Duodenal Ulcer      F    57   ABD                        s1/i1/m1     HM047657
K68-GU   Gastric Ulcer       F    46   ABD                        s1/i1/m1     HM047658
K69-GU   Gastric Ulcer       M    63   ABD        FJ458129        s1/i1/m1     GQ338234
K70-G    Gastritis           F    68   ABC        FJ458130        s1/i2/m2     HM047659
K71-G    Gastritis           M    54   ABD                        s1/i1/m1     HM047660
K72-GU   Gastric Ulcer       M    34   ABD                        s1/i1/m1     GQ338236
K73-GU   Gastric Ulcer       M    72   ABD                        s1/i1/m1     HM047661
K74-G    Gastritis           F    52   ABD                        s1/i1/m1     GQ338237
K75-G    Gastritis           F    24   ABD                        s1/i1/m1     HM047662
K76-G    Gastritis           F    55   ABD                        s1/i1/m1     HM047663
K77-G    Gastritis           F    37   ABD                        s1/i1/m1     GQ338238
K78-G    Gastritis           M    36   AABD       FJ458131        s1/i1/m1     HM047664
K79-GU   Gastric Ulcer       F    84   ABD                        s1/i1/m1     HM047665
K80-CA   Cancer              F    61   ABD                        s1/i1/m1     GQ338240
K81-GU   Gastric Ulcer       M    47   ABD                        s1/i1/m1     GQ338241
K82-G    Gastritis           M    39   ABD        FJ458132        s1/i1/m1
K83-G    Gastritis           F    75   ABD                        s1/i1/m1     HM047666
K84-G    Gastritis           M    48   ABD                        s1/i1/m1     HM047667
K85-G    Gastritis           F    28   BD                         s1/i1/m1     HM047668
K86-G    Gastritis           M    37   ABCC                       s1/i1/m1     HM047669
K87-G    Gastritis           F    52   ABD                        s1/i1/m1     HM047670
K88-G    Gastritis           F    69   ABD                        s1/i1/m1     HM047671
K89-GU   Gastric Ulcer       M    38   ABD                        s1/i1/m1     HM047672
K90-GU   Gastric Ulcer       M    51   ABD                        s1/i1/m1     HM047673
K91-GU   Gastric Ulcer       M    82   ABD                        s1/i1/m1     HM047674
K92-GU   Gastric Ulcer       M    41   ABD                        s1/i1/m1     HM047675
K93-DU   Duodenal Ulcer      F    37   ABC        FJ458133        s1/i1/m1     GQ338243
K94-GU   Gastric Ulcer       M    65   ABD                        s1/i1/m1     HM047676
K95-CA   Cancer              F    41   ABD                        s1/i1/m1     HM047677
K96-G    Gastritis           F    47   ABD                        s1/i1/m1     HM047678
K97-GU   Gastric Ulcer       M    51   ABD                        s1/i1/m1     HM047679
K98-DU   Duodenal Ulcer      M    23   ABD                        s1/i1/m1     HM047680
K99-G    Gastritis           F    54   ABD                        s1/i2/m2     HM047681
K100-GU  Gastric Ulcer       M    46   ABD                        s1/i1/m1     HM047682
K101-GU  Gastric Ulcer       F    61   ABD                        s1/i1/m1     HM047683
K102-DU  Duodenal Ulcer      M    38   ABD                        s1/i1/m1     HM047684
K103-G   Gastritis           M    32   ABD                        s1/i3/m1     HM047685
K104-CA  Cancer              M    46   ABD                        s1/i1/m1     GQ338185
K105-GU  Gastric Ulcer       M    71   ABD                        s1/i1/m1     HM047686
K106-DU  Duodenal Ulcer      M    14   ABD                        s1/i1/m1     HM047687
K107-DU  Duodenal Ulcer      M    26   ABD                        s1/i1/m1     HM047688
K108-GU  Gastric Ulcer       M    62   ABD                        s1/i1/m1     HM047689
K109-G   Gastritis           M    40   ABD                        s1/i1/m1     GQ338186
K110-GU  Gastric Ulcer       M    81   ABCC       FJ458134        s1/i1/m1     HM047690
K111-DU  Duodenal Ulcer      F    36   ABD                        s1/i1/m1     GQ338187
K112-G   Gastritis           M    57   ABD                        s1/i1/m1     GQ338188
K113-G   Gastritis           M    29   ABD                        s1/i1/m1     HM047691
K114-DU  Duodenal Ulcer      F    47   ABC                        s1/i1/m1     HM047692
K115-G   Gastritis           F    82   ABC        FJ458135        s1/i1/m2     HM047693
K116-G   Gastritis           M    59   ABD                        s1/i1/m1     HM047694
K117-G   Gastritis           F    21   ABD        FJ458136        s1/i1/m1     GQ338189
K118-CA  Cancer              F    67   ABD                        s1/i1/m1     HM047695
K119-DU  Duodenal Ulcer      M    31   ABD        FJ458137        s1/i1/m1     HM047696
K120-G   Gastritis           F    41   ABC                        s1/i1/m1     GQ338190
K121-GU  Gastric Ulcer       M    42   ABD                        s1/i1/m1     HM047697
K122-DU  Duodenal Ulcer      M    60   ABD                        s1/i1/m1     HM047698
K123-G   Gastritis           M    76   ABD        FJ458138        s1/i1/m1     GQ338191
K125-G   Gastritis           M    59   ABD                        s1/i1/m1     HM047699
K126-GU  Gastric Ulcer       M    69   ABD                        s1/i1/m1     HM047700
K127-GU  Gastric Ulcer       M    71   ABD                        s1/i1/m1     HM047701
K128-GU  Gastric Ulcer       M    58   ABC                        s1/i1/m1     GQ338192
K129-GU  Gastric Ulcer       M    36   *                          *
K130-G   Gastritis           F    64   *                          *
K131-G   Gastritis           F    61   ABD        FJ458139        s1/i1/m1     GQ338193
K132-GU  Gastric Ulcer       M    23   *                          *
K133-GU  Gastric Ulcer       M    63   *                          *
K134-GU  Gastric Ulcer       M    46   *                          *
K135-DU  Duodenal Ulcer      F    62   *                          *
K136-G   Gastritis           F    52   *                          *
K137-G   Gastritis           F    62   *                          *
K138-GU  Gastric Ulcer       F    21   *                          *
K139-GU  Gastric Ulcer       F    49   *                          *
K140-G   Gastritis           F    49   *                          *
K141-G   Gastritis           M    57   *                          *
K142-GU  Gastric Ulcer       M    65   *                          *
K143-GU  Gastric Ulcer       F    71   *                          *
K144-GU  Gastric Ulcer       F    53   *                          *
K145-GU  Gastric Ulcer       M    62   *                          *
K146-G   Gastritis           M    40   ABD**      FJ458140        s1/i1/m1     HM047702
K147-GU  Gastric Ulcer       M    62   *                          *
K148-GU  Gastric Ulcer       M    37   *                          *
K149-GU  Gastric Ulcer       M    71   *                          *
K150-G   Gastritis           F    26   ABD                        s1/i1/m1     HM047703
K151-GU  Gastric Ulcer       M    65   ABD        FJ458141        s1/i1/m1     HM047704
K152-G   Gastritis           F    62   ABD                        s1/i1/m1     HM047705
K153-GU  Gastric Ulcer       M    42   ABD                        s1/i1/m1     HM047706
K154-G   Gastritis           F    55   ABCCCC     FJ458142        s1/i1/m2     HM047707
K155-DU  Duodenal Ulcer      N/A  N/A  ABD                        s1/i1/m1     HM047708
K156-DU  Duodenal Ulcer      F    47   ABD                        s1/i1/m1     HM047709
K157-G   Gastritis           M    43   ABD                        s1/i1/m1     HM047710
K158-G   Gastritis           M    60   ABD                        s1/i1/m1     HM047711
K159-G   Gastritis           F    35   ABD                        s1/i1/m1     HM047712
K160-DU  Duodenal Ulcer      M    30   ABD                        s1/i1/m1     HM047713
K161-G   Gastritis           F    65   ABD                        s1/i1/m1     HM047714
K162-G   Gastritis           F    63   ABD                        s1/i1/m1     GQ338195
K163-G   Gastritis           F    66   ABD                        s1/i1/m1     HM047715
K164-G   Gastritis           M    43   ABD                        s1/i1/m1     HM047716
K165-G   Gastritis           M    28   ABD                        s1/i1/m1     HM047717
K166-G   Gastritis           F    38   ABC                        s1/i2/m2     HM047718
K167-G   Gastritis           F    27   ABD                        s1/i1/m1     HM047719
K169-G   Gastritis           F    47   ABD                        s1/i1/m1     HM047720
K170-G   Gastritis           F    41   ABD**      FJ458143        s1/i1/m1     HM047721
K171-CA  Cancer              F    72   ABD        FJ458144        s1/i1/m1     HM047722
K172-G   Gastritis           F    31   ABCC       FJ458145        s1/i1/m1     HM047723
K173-G   Gastritis           F    45   ABD        FJ458146        s1/i1/m1     HM047724
K174-G   Gastritis           N/A  N/A  ABD                        s1/i1/m1     HM047725
K175-G   Gastritis           F    41   ABD                        s1/i1/m2     HM047726
K176-DU  Duodenal Ulcer      N/A  N/A  ABD                        s1/i1/m1     HM047727
K177-G   Gastritis           F    39   ABD                        s1/i1/m1     HM047728
K178-G   Gastritis           F    40   ABD                        s1/i1/m1     GQ338197
K179-G   Gastritis           F    38   ABCCC      FJ458147        s1/i1/m2     HM047729
K180-G   Gastritis (polyps)  F    50   ABD                        s1/i1/m1     HM047730
K181-DU  Duodenal Ulcer      F    57   ABD                        s1/i1/m2     HM047731
K182-DU  Duodenal Ulcer      N/A  N/A  ABD                        s1/i1/m1     GQ338198
K183-G   Gastritis           M    40   ABD                        s1/i1/m1     GQ338199
K184-G   Gastritis           F    55   ABD                        s1/i1/m1     HM047732
K185-G   Gastritis           F    52   ABD                        s1/i1/m1     GQ338200
K186-G   Gastritis           M    41   ABD                        s1/i1/m1     HM047733
K188-G   Gastritis (IM)      F    43   ABD                        s1/i1/m1     HM047734
K190-G   Gastritis           M    61   ABC                        s1/i1/m1     GQ338201
K192-DU  Duodenal Ulcer      F    61   AABD       FJ458148        s1/i1/m1     HM047735
K193-G   Gastritis           F    50   ABD                        s1/i1/m1     GQ338202
K194-GU  Gastric Ulcer       M    48   *                          s1/i1/m1     HM047564
K195-GU  Gastric Ulcer       F    48   ABD        FJ458149        s1/i1/m1     HM047565
K196-G   Gastritis           F    50   ABD                        s1/i1/m1     GQ338203
K197-G   Gastritis           F    45   ABD                        s1/i1/m1     GQ338204
K198-GU  Gastric Ulcer       F    56   ABC        FJ458150        s1/i2/m2     HM047566
K199-GU  Gastric Ulcer       M    50   ABD                        s1/i1/m1     HM047567
K200-GU  Gastric Ulcer       M    63   ABD                        s1/i1/m1     HM047568
K201-GU  Gastric Ulcer       M    55   ABD                        s1/i1/m1     HM047569
K202-GU  Gastric Ulcer       M    41   ABD                        s1/i1/m1     HM047570
K203-G   Gastritis           F    55   ABD                        s1/i1/m1     HM047571
K204-GU  Gastric Ulcer       F    63   ABD                        s1/i1/m2     HM047572
K205-GU  Gastric Ulcer       F    57   ABD                        s1/i1/m1     HM047573
K206-GU  Gastric Ulcer       F    51   ABD                        s1/i1/m1     HM047574
K207-G   Gastritis           M    39   ABD                        *            GQ338206
K208-G   Gastritis           F    56   ABD        FJ458151        s1/i1/m1     GQ338207
K209-G   Gastritis           F    24   ABD                        s1/i1/m1     GQ338208
K210-G   Gastritis           F    61   ABD                        s1/i1/m1     HM047575
K211-G   Gastritis           F    54   ABD                        s1/i1/m1     HM047576
K212-G   Gastritis           F    45   ABD                        s1/i1/m1     HM047577
K213-G   Gastritis           F    52   *                          *
K214-G   Gastritis           F    53   *                          *
K215-GU  Gastric Ulcer       F    73   *                          *
K216-G   Gastritis           F    67   ABD                        s1/i1/m1     HM047578
K217-G   Gastritis           M    77   ABD                        *
K218-G   Gastritis           F    62   ABD                        s1/i1/m1     HM047579
K219-G   Gastritis           M    40   ABD        FJ458152        s1/i1/m1     HM047580
K220-DU  Duodenal Ulcer      N/A  N/A  ABD                        s1/i1/m1     HM047581
K221-G   Gastritis           M    37   BD                         s1/i1/m1     HM047582
K222-GU  Gastric Ulcer       M    41   ABD                        s1/i1/m1     HM047583
K223-G   Gastritis           F    25   ABD        FJ458153        s1/i1/m1     HM047584
K224-G   Gastritis           F    35   ABD                        s1/i1/m1     HM047585
K225-DU  Duodenal Ulcer      F    60   ABC                        s1/i1/m1     HM047586
K226-GU  Gastric Ulcer       M    42   ABD                        s1/i1/m1     HM047587
K227-G   Gastritis           F    31   ABD                        s1/i1/m1     HM047588
K228-GU  Gastric Ulcer       M    54   ABD                        s1/i1/m1     HM047589
K229-GU  Gastric Ulcer       M    62   ABD                        s1/i1/m1     HM047590
K230-G   Gastritis (IM)      M    56   ABD                        s1/i1/m1     HM047591
K231-GU  Gastric Ulcer       M    42   ABD                        s1/i1/m1     HM047592
K232-G   Gastritis           F    56   ABD                        *
K233-G   Gastritis           M    38   ABD                        s1/i1/m1     HM047594
K234-DU  Duodenal Ulcer      M    41   ABD                        s1/i1/m2     HM047595
K235-G   Gastritis           F    50   ABD                        s1/i1/m1     HM047596
K236-G   Gastritis           F    64   ABD                        s1/i1/m1     HM047597
K237-G   Gastritis           F    48   ABD                        s1/i1/m1     HM047598
K238-DU  Duodenal Ulcer      M    55   ABD                        s1/i1/m1     GQ338211
K239-G   Gastritis           M    46   ABD                        *            HM047599
K240-G   Gastritis           F    41   ABD                        s1/i1/m1     HM047600
K241-G   Gastritis           M    41   ABD                        s1/i1/m1     HM047601
K242-G   Gastritis           M    78   ABD                        s1/i1/m1     HM047602
K243-G   Gastritis           F    60   BC                         *
K244-DU  Duodenal Ulcer      N/A  N/A  ABD                        s1/i1/m1     HM047603
K245-G   Gastritis           M    19   ABD                        s1/i1/m1     HM047604
K246-G   Gastritis           F    40   ABD                        s1/i1/m1     HM047605
K247-G   Gastritis           F    56   ABD                        s1/i1/m1     HM047606
K248-G   Gastritis           M    58   ABD                        s1/i1/m1     GQ338212
K249-GU  Gastric Ulcer       F    48   ABC                        s1/i1/m1     GQ338213
K250-G   Gastritis           F    53   ABD                        s1/i1/m1     HM047607
K251-DU  Duodenal Ulcer      M    70   ABD                        s1/i1/m1     HM047608
K253-DU  Duodenal Ulcer      F    61   ABC                        s1/i1/m1     HM047609
K254-G   Gastritis           M    54   ABD                        s1/i1/m1     HM047610
K255-G   Gastritis           F    74   ABD        FJ458154        s1/i1/m1     GQ338215
K256-G   Gastritis           F    51   ABD                        s1/i1/m1     HM047611
K257-CA  Cancer              M    64   ABD                        s1/i1/m1     HM047612
K258-CA  Cancer              M    68   ABD        FJ458155        s1/i1/m1     GQ338216
K259-CA  Cancer              M    44   ABD        FJ458156        s1/i1/m2     GQ338217
K260-CA  Cancer              M    58   ABD        FJ458157        s1/i1/m1     GQ338219
K261-CA  Cancer              F    48   ABD        FJ458158        s1/i1/m1     GQ338220
K262-G   Gastritis           F    56   ABC        FJ458159        s1/i1/m1     GQ338221
K263-G   Gastritis           M    59   ABABD***   FJ458160        s1/i1/m1     HM047613
K264-DU  Duodenal Ulcer      M    32   ABD        FJ458161        s1/i1/m1     HM047614
K265-DU  Duodenal Ulcer      M    42   ABD        FJ458162        s1/i1/m1     HM047615
K266-G   Gastritis           F    34   ABD        FJ458163        s1/i1/m1     HM047616

* Indeterminate in genotyping assay

** The proline of the EPIYA-B motif is replaced with a serine (ESIYA); therefore classified as other

*** ABABD: the proline of the second EPIYA-B motif is replaced with a leucine (ELIYA); therefore classified as other

a  +  indicates  frameshift  in  the  i  region 




Chapter  Five 


Construction and Analysis of Isogenic Strains of Helicobacter pylori That Differ Only in
the CagA EPIYA Motif

Kathleen  R.  Jones,  Jeannette  M.  Whitmire,  Shana  Miles,  Sungil  Jang,  Jeong-Heon 
Cha,  and  D.  Scott  Merrell 

The  work  presented  in  this  chapter  is  the  sole  work  of  K.  R.  Jones  with  the 
following  exceptions:  J.H.  Cha  assisted  with  creation  of  the  isogenic  strains,  and  S.  Jang 
assisted  with  the  creation  of  the  new  restorant  strain.  S.  Miles  assisted  with  the  animal 
work,  and  J.M.  Whitmire  assisted  with  immunofluorescent  staining,  microscopy, 
morphological  assays,  and  figure  generation. 

Introduction 

The process of H. pylori-induced pathogenesis, including development of gastric cancer, is
not well understood. This is despite the fact that progress has been made in elucidating
some key virulence factors that impact disease progression. In fact, a couple of virulence
factors have been shown to have profound effects via the deregulation of host cell
signaling pathways. The best characterized of these virulence factors is the
cytotoxin-associated gene A, cagA, which has emerged as a major contributor to the
development of gastric cancer (6, 18). A number of cagA alleles exist, and these alleles
differ in the carboxy-terminus of the encoded protein. Variation specifically occurs in
the EPIYA region, and typically involves changes in the amino acid sequences flanking the
five-amino-acid repeat (19, 21, 22).

As mentioned in the introduction to this thesis, the sequence surrounding the EPIYA motif
shows divergence across strains, which has led to the classification of four distinct
EPIYA motifs: EPIYA-A, -B, -C, and -D (21). Western CagA contains a combination of
EPIYA-A, -B, and -C motifs (isolates with up to five EPIYA-C motifs have been identified)
(3), while East Asian CagA contains a combination of EPIYA-A, -B, and -D motifs (11, 21,
22, 28). These EPIYA repeats are important for CagA phosphorylation-dependent effects; in
addition, a multimerization domain within the EPIYA repeat region affects downstream
signaling pathways in a phosphorylation-independent manner (26).
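
The motif-based distinction described above can be expressed as a small rule. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the handling of atypical strings (a motif string with neither C nor D, with both, or with extra characters such as footnote marks) is a simplification rather than the genotyping scheme used in this work.

```python
def classify_epiya(motifs: str) -> str:
    """Classify a CagA EPIYA motif string such as 'ABD' or 'ABCCC'.

    Simplified rule (an assumption): any EPIYA-D without EPIYA-C is
    'East Asian', any EPIYA-C without EPIYA-D is 'Western', and
    everything else is 'other'.
    """
    segments = set(motifs.strip().upper())
    if not segments or not segments <= set("ABCD"):
        return "other"  # empty, or contains characters outside A-D
    has_c, has_d = "C" in segments, "D" in segments
    if has_d and not has_c:
        return "East Asian"
    if has_c and not has_d:
        return "Western"
    return "other"


for motif in ["ABD", "ABC", "ABCCC", "AABD", "ABD**"]:
    print(motif, "->", classify_epiya(motif))
```

A stricter variant could require the exact string "ABD", which would match the EPIYA-ABD versus non-EPIYA-ABD grouping used in the statistical comparisons of the preceding chapter.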

Numerous  lines  of  evidence  indicate  that  CagA  polymorphisms  may  dramatically 
impact  //.  pylori- induced  disease  etiology.  In  fact,  in  vitro  assays  have  demonstrated  a 
dose-dependent  response  in  the  levels  of  CagA  tyrosine  phosphorylation,  SHP-2  binding, 
and  host  cell  morphological  changes  that  occur  as  a  result  of  increasing  numbers  of 
EPIYA-C  motifs  (22).  There  are  also  differences  in  induced  inflammation  depending  on 
which  CagA  form  is  present  in  the  infecting  strain.  Epidemiological  studies  have  shown 
that  among  patients  infected  with  H.  pylori  strains  containing  Western  CagA,  increased 
inflammation  and  increased  disease  severity  correlate  with  an  increasing  number  of 
EPIYA-C  motifs  (30).  In  vitro  studies  comparing  East  Asian  CagA  to  Western  CagA 
have also shown that East Asian CagA binds SHP-2 with greater affinity and induces
more  significant  morphological  changes  than  Western  CagA  containing  up  to  three 
EPIYA-C  motifs  (20,  21).  Furthermore,  epidemiological  studies  have  shown  that  there  is 
significantly more inflammation and atrophy in patients infected with H. pylori strains
carrying the East Asian CagA (5). Taken together, these data indicate that differences within
the  EPIYA  motifs  of  CagA  impact  disease  development  (3-5,  14,  15,  30). 

While this information has been useful and strongly suggests that differences in
CagA impact disease development, all of these previous studies employed epidemiological
data, used non-isogenic clinical isolates, or relied on transfection models. While the
results indicate interesting trends, since the genetic variability of H. pylori strains is
between 5 and 7%, there is concern that work with non-isogenic strains may not be a true
indicator of the exact role of the EPIYA motif in disease development (1, 17, 24, 29).
Furthermore,  transfection  assays  likely  do  not  recapitulate  biological  delivery  of  CagA  by 
an  infecting  bacterium. 

The  goal  of  this  study  was  to  elucidate  the  role  of  the  various  EPIYA  motifs  in 
disease development. To this end we used splicing by overlap extension (SOE) PCR
and  transformation  to  create  isogenic  strains  of  H.  pylori  that  varied  only  in  the  EPIYA 
region  of  CagA.  These  strains  were  then  assessed  for  their  growth  kinetics,  ability  to 
express  CagA,  localization  differences  on  host  cells,  adherence  to  and  internalization  into 
host  cells,  ability  to  translocate  and  phosphorylate  CagA,  and  ability  to  deregulate  host 
cell  pathways  as  assessed  by  changes  in  host  cell  morphology.  Moreover,  since  a  recent 
Mongolian  gerbil  model  was  shown  to  reproducibly  develop  CagA-dependent  gastric 
cancer within a period of 6-12 weeks after infection with H. pylori strain 7.13 (14, 15),
we  used  this  model  to  investigate  the  role  of  the  different  EPIYA  motifs  in  disease 
progression.  Through  the  course  of  all  these  studies,  it  became  apparent  that  there  were 
secondary  mutations  within  the  isogenic  strains  that  would  prevent  our  ability  to 
appropriately compare the strains. However, the methods optimized during this study
should  be  useful  to  the  lab  in  the  future.  To  this  end,  this  chapter  details  the  optimized 
methods  for  each  of  the  assays  mentioned  above  and  briefly  mentions  the  results  that 
indicated  the  presence  of  secondary  mutations  in  our  strains. 

Materials  and  Methods  and  Results 
Bacterial  Strains  and  Growth  Conditions 

Bacterial  strains  and  plasmids  are  listed  in  Table  11,  and  primers  are  listed  in 
Table  12.  E.  coli  strains  were  maintained  as  frozen  (-80°C)  stocks  in  LB  broth  (MoBio, 
Carlsbad,  CA)  supplemented  with  40%  glycerol  (EMD  Chemicals,  Inc.,  Gibbstown,  NJ), 
and expanded on LB agar (MoBio, Carlsbad, CA) plates or in LB broth liquid cultures.
All  cultures  of  E.  coli  were  grown  at  37°C  and  liquid  cultures  were  maintained  with 
shaking  at  200  rpm.  H.  pylori  strains  were  maintained  as  frozen  (-80°C)  stocks  in  brain 
heart  infusion  broth  (BD,  Sparks,  MD)  supplemented  with  20%  glycerol  and  10%  fetal 
bovine serum (FBS; Gibco, Carlsbad, CA). All H. pylori strains were grown at 37°C and
expanded on antibiotic-supplemented horse blood agar plates consisting of 4% Columbia
agar base (Neogen Corporation, Lansing, MI), 5% defibrinated horse blood (HemoStat
Laboratories, Dixon, CA), 0.2% β-cyclodextrin (Sigma, St. Louis, MO), 8 µg/ml
amphotericin B (Amresco, Solon, OH), 2.5 U/ml polymyxin B (Sigma, St. Louis, MO), 5
µg/ml cefsulodin (Sigma, St. Louis, MO), 5 µg/ml trimethoprim (Sigma, St. Louis, MO),
and 10 µg/ml vancomycin (Amresco, Solon, OH). H. pylori liquid cultures consisted of
brucella  broth  (BB)  (Neogen  Corporation,  Lansing,  MI)  containing  10%  fetal  bovine 


Table 11: Bacterial strains and plasmids

Strain Name   Organism         EPIYA motif        Antibiotic resistance   Citation
7.13          H. pylori        EPIYA-ABtC         -                       (14, 15)
DSM3          E. coli DH5α     -                  Kan                     (10)
DSM598        E. coli Top 10   -                  Amp                     This study
DSM599        E. coli Top 10   -                  Amp & Kan               This study
DSM600        H. pylori        ΔcagA              Kan                     This study
DSM530        E. coli Top 10   -                  Amp                     This study
DSM531        E. coli Top 10   -                  Amp & Kan               This study
DSM577        H. pylori        ΔEPIYA             Kan                     This study
DSM926        H. pylori        ΔEPIYA             Kan                     This study
DSM591        H. pylori        EPIYA-ABtCCCC      -                       (23)
DSM532        E. coli Top 10   EPIYA-ABt          Amp                     This study
DSM601        H. pylori        EPIYA-ABt          -                       This study
DSM570        E. coli Top 10   EPIYA-ABtC         Amp                     This study
DSM602        H. pylori        EPIYA-ABtC         -                       This study
DSM571        E. coli Top 10   EPIYA-ABtCC        Amp                     This study
DSM605        H. pylori        EPIYA-ABtCC        -                       This study
DSM572        E. coli Top 10   EPIYA-ABtCCC       Amp                     This study
DSM606        H. pylori        EPIYA-ABtCCC       -                       This study
DSM573        E. coli Top 10   EPIYA-ABtCCCC      Amp                     This study
DSM609        H. pylori        EPIYA-ABtCCCC      -                       This study
DSM590        H. pylori        EPIYA-ABD          -                       (23)
DSM533        E. coli Top 10   EPIYA-ABD          Amp                     This study
DSM547        E. coli Top 10   EPIYA-ABtD         Amp                     This study
DSM616        H. pylori        EPIYA-ABtD         -                       This study
DSM641        E. coli Top 10   EPIYA-ABtC         Amp                     This study
DSM613        H. pylori        EPIYA-ABtC (WT)    -                       This study
DSM927        H. pylori        EPIYA-ABtC (WT)    -                       This study

Table 12: Primer Sequences

Primer Name            Primer Sequence                                    Citation
7.13 del CagA-fp       CGTCTTTAACACAAGCAACACG                             This study
7.13 del CagA M-rp     GATTTTTGGCCCGGGAGGCTCGAGCATTGTTTCTCCTTACTATACC     This study
7.13 del CagA M-fp     GAAACAATGCTCGAGCCTCCCGGGCCAAAAATCTTAAAGGATTAAGG    This study
7.13 del CagA-rp       GTTTATGCTCTCTTTATAACCCC                            This study
SacBSCN-F2             CGAATCGAATTCAGGAAC                                 (9)
Grace 1                GGTTGCACGCATTTTCCC                                 This study
7.13 del EPIYA-5-fp    GTCTGATAAGTTTGAAAACATC                             This study
7.13 del EPIYA M-rp    GTCTATCCCCGGGAGGCTCGAGCCCATTACCGACTAGGGTTCC        This study
7.13 del EPIYA M-fp    GTAATGGGCTCGAGCCTCCCGGGGATAGACAAGCTCAAAGATTC       This study
7.13 del EPIYA-3-rp    CCTTGTTTTTAGCAAGGGGTGG                             This study
K154-7IS-M1rp          GGCTTCTGCTTGAGATAACCCATTACCGACTAGGGTTCC            This study
K154-7IS-M2fp          GATAGACAAGCTCAAAGATTCTAC                           This study
K154-7IS-M1fp          GGAACCCTAGTCGGTAATGGGTTATCTCAAGCAGAAGCC            This study
K154-7IS-M12rp         GTTTCAATTCTTGCTCCCTTGAAAGCCCTACCTTACTGAG           This study
K154-7IS-M12fp         GGGCTTTCAAGGGAGCAAGAATTGAAAC                       This study
K154-7IS-M2rp          GTAGAATCTTTGAGCTTGTCTATC                           This study
7.13cagA-2742-fp       GGAAGCAAAAGCTCAAGCTAACAGC                          This study
7.13cagA-4561-rp       TACAGGTCTCACACATCATATCTCC                          This study
K3-7IS-M1rp            CGTTGTGGCTTCTGTTTTAGATAACCCATTACCGACTAGGGTTCC      This study
K3-7IS-M2fp            GGTCATTTTGGCAAACTAGAACAAAAGATAGACAAGCTCAAAGATTC    This study
K3-7IS-M1fp            GGAACCCTAGTCGGTAATGGGTTATCTAAAACAGAAGCCACAACG      This study
K3-7IS-M2rp            GAATCTTTGAGCTTGTCTATCTTTTGTTCTAGTTTGCCAAAATGACC    This study
K3(A/T)-rp             CACCTTTTTAGCAACTTGAGTGTAAATGGGCTCTTCAGGGC          This study
K3(A/T)-fp             GCCCTGAAGAGCCCATTTACACTCAAGTTGCTAAAAAGGTG          This study
serum and 10 µg/ml vancomycin. For experiments, overnight liquid cultures of H. pylori
were subcultured in fresh medium to an optical density at 600 nm of 0.05 and were
grown for either 12 or 18 hours with shaking at 100 rpm under microaerobic conditions (10%
CO2, 5% O2, and 85% N2) created by an Anoxomat evacuation/replacement system
(Spiral Biotech, Norwood, MA).
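
The subculturing step to a starting optical density of 0.05 amounts to a simple dilution calculation. The sketch below is illustrative only: the function name, the 10 ml final volume, and the example overnight reading are assumptions rather than values from this study, and it assumes that optical density scales linearly with cell density over the range used.

```python
def subculture_volumes(od_overnight, target_od=0.05, final_volume_ml=10.0):
    """Return (culture_ml, fresh_medium_ml) needed to reach target_od.

    Uses C1 * V1 = C2 * V2, treating OD600 as proportional to cell density.
    """
    if od_overnight <= 0 or target_od <= 0 or final_volume_ml <= 0:
        raise ValueError("all inputs must be positive")
    if target_od > od_overnight:
        raise ValueError("overnight culture is more dilute than the target")
    culture_ml = target_od * final_volume_ml / od_overnight
    return culture_ml, final_volume_ml - culture_ml


# Hypothetical example: an overnight culture that reads OD600 = 1.2.
culture_ml, medium_ml = subculture_volumes(od_overnight=1.2, final_volume_ml=10.0)
print(f"{culture_ml:.2f} ml culture + {medium_ml:.2f} ml fresh medium")
```

Dense overnight cultures may need to be diluted before reading so that the measurement stays in the linear range of the spectrophotometer.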

When  appropriate,  bacterial  cultures  were  supplemented  with  the  following 
antibiotics, as noted in Table 11: ampicillin (Amp) (USB Corporation, Santa Clara, CA)
at 100 µg/ml and/or kanamycin (Kan) (Gibco, Carlsbad, CA) at 25 µg/ml. Additionally,
5%  sucrose  (Sigma,  St.  Louis,  MO)  was  added  to  horse  blood  agar  plates  when  needed. 

Creation  of  Isogenic  Strains 

All strains were created in the wild type strain 7.13 background (14, 15). The
ΔcagA, ΔEPIYA, and all of the isogenic strains were created using splicing by overlap
extension (SOE) PCR, using the primers listed in Table 12 and Fig. 14. The basic
strategy used is depicted in Fig. 13. Briefly, the EPIYA region of strain 7.13 was
replaced with the counter-selectable kan-sacB cassette, and then different EPIYA motifs
of interest were used to replace the kan-sacB cassette, thus yielding the different isogenic
strains.

ΔEPIYA Strain

The first strain that was created was the ΔEPIYA strain, which then served as the
parental strain for construction of all of the isogenic strains. To create this strain, a
ΔEPIYA construct was created that contained a fusion product consisting of the region
upstream (5' to the region to be replaced) of the EPIYA motifs and the region
downstream (3' of the region to be replaced) of the EPIYA motifs (Fig. 14A). Next, this
construct was digested and ligated with the kan-sacB cassette to yield a construct that
contained the upstream region of 7.13 cagA - the kan-sacB cassette - the downstream
region of 7.13 cagA. The resulting construct was then transformed into H. pylori, and
recombinants were selected as described below.

Figure 13: Creation of isogenic strains. A schematic showing the general strategy used
to create the isogenic strains is shown. The wild type (7.13) EPIYA region was replaced
with the counter-selectable kan-sacB cassette, yielding DSM577, which is KanR and
SucS. DSM577 (ΔEPIYA) was then used as the parental strain background for all
isogenic strain construction. The different EPIYA regions were engineered to be flanked
by the wild type upstream and downstream cagA regions, and to replace the kan-sacB
cassette via double homologous recombination.

The first step in construction of the ΔEPIYA construct was accomplished by
amplifying the upstream region of 7.13 cagA with the primers 7.13 del EPIYA-5-fp and
7.13 del EPIYA M-rp, the latter of which was engineered to contain an XhoI restriction
site (Fig. 14A). The downstream region was amplified using primer 7.13 del EPIYA M-fp,
which was engineered to contain a SmaI restriction site, and primer 7.13 del EPIYA-3-rp
(Fig. 14A). In order to amplify a single fused product, these upstream and
downstream PCR products were gel purified using the QIAquick gel extraction kit
(Qiagen, Germantown, MD), and the purified products were combined in a SOE reaction
using the 7.13 del EPIYA-5-fp and 7.13 del EPIYA-3-rp primers (Fig. 14A). The
amplified fusion product was cloned into pGEM-T Easy (Promega, Madison, WI), and
the resulting strain was named DSM530. Orientation of the construct was confirmed by
EcoRI (New England BioLabs, Inc., Ipswich, MA), XhoI (Invitrogen, Carlsbad, CA), and
SmaI (New England BioLabs, Inc., Ipswich, MA) restriction digestion, and the fusion was
also sequenced using the T7 and SP6 primers. Next, a construct containing the kan-sacB
cassette was created. The kan-sacB cassette was purified from pDSM3 by digestion with
XhoI and SmaI (10). pDSM530 was then similarly digested with XhoI and SmaI and
ligated to the purified kan-sacB cassette, thus yielding a plasmid that contained the 7.13
cagA upstream region - the kan-sacB cassette - the 7.13 cagA downstream region
construct (Fig. 13). The proper size of this construct was verified by EcoRI and SmaI
digestion. The strain bearing this construct was named DSM531. pDSM531 was then
transformed into H. pylori 7.13, where it integrated into the chromosome via double
homologous recombination. Transformants were selected based on kanamycin
resistance. Due to the difference in size between the wild type EPIYA motif region and
the EPIYA region replaced with the kan-sacB cassette, integration was confirmed via
PCR using the 7.13 del EPIYA-5-fp and 7.13 del EPIYA-3-rp primers. Proper
integration was also confirmed by successful amplification using a primer that lies within
the kan-sacB cassette (SacBSCN-F2), and a primer in the glutamate racemase gene
immediately downstream of cagA. The resulting ΔEPIYA strain was named DSM577.
After the preliminary in vitro and in vivo characterization of the isogenic strains, which
indicated that there were second site mutations within these strains, a new ΔEPIYA strain
(DSM926) was created using identical methods.

Figure 14: Creation of the various EPIYA motifs. A schematic showing the strategy
used to create the different EPIYA regions is shown. A.) The basic strategy used to
create the homologous regions for the ΔEPIYA region replacement is shown with
primers listed. The upstream region of homology, which is 5' of the region to be
replaced, is indicated by a 5', and the downstream region of homology, which is 3' of the
region to be replaced, is indicated by a 3'. This construct was digested and ligated to the
kan-sacB cassette to create the ΔEPIYA strain, which has the EPIYA region replaced by
the kan-sacB cassette (Fig. 13). For Western strains (EPIYA-ABt, -ABtC, -ABtCC,
-ABtCCC, and -ABtCCCC), the upstream region (5') of homology was amplified with
primers 7.13 del EPIYA-5-fp and K154-7IS-M1rp, and the downstream region (3') of
homology was amplified with primers K154-7IS-M2fp and 7.13 del EPIYA-3-rp. For the
East Asian strain (EPIYA-ABtD), the upstream cagA region (5') of homology was
amplified with primers 7.13 del EPIYA-5-fp and K3-7IS-M1rp, and the downstream
cagA region (3') of homology was amplified with primers K3-7IS-M2fp and 7.13 del
EPIYA-3-rp. The regions of homology from 7.13 were used in the creation of all the
isogenic strains, except for the ΔcagA and restorant strains. B.) A schematic of how the
EPIYA-ABt strain was created with the different PCR primers and SOE products is
shown. The regions of homology are from 7.13 and are depicted by the light blue boxes
(5' and 3'). C.) A schematic of how the EPIYA-ABtC, -ABtCC, -ABtCCC, and
-ABtCCCC strains were created with the different PCR primers and SOE products is
shown. The regions of homology are from 7.13 and are depicted by the light blue boxes
(5' and 3'). D.) A schematic of how the EPIYA-ABtD strain was created with the
different PCR primers and SOE products is shown. Site-directed mutagenesis primers are
designated. Again, the regions of homology are from 7.13 and are depicted by the light
blue boxes (5' and 3'). E.) A schematic of how the restorant strain was created and the
PCR primers used are indicated. F.) The basic strategy used to create the homologous
regions for the replacement of the cagA gene by the kan-sacB cassette is depicted. PCR
primers and the SOE product are shown. This construct was digested and ligated to the
kan-sacB cassette and used to create the ΔcagA strain, which has the cagA gene replaced
by the kan-sacB cassette.
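
As a sanity check on the ΔEPIYA primer design, the two internal primers (7.13 del EPIYA M-rp and 7.13 del EPIYA M-fp) can be compared computationally, since an SOE reaction relies on the end of the upstream amplicon matching the start of the downstream amplicon. The sketch below uses the sequences exactly as listed in Table 12; the helper names are illustrative, and the standard recognition sequences for XhoI (CTCGAG) and SmaI (CCCGGG) are supplied here rather than taken from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap_length(left, right):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


# Primer sequences copied from Table 12.
M_RP = "GTCTATCCCCGGGAGGCTCGAGCCCATTACCGACTAGGGTTCC"   # 7.13 del EPIYA M-rp
M_FP = "GTAATGGGCTCGAGCCTCCCGGGGATAGACAAGCTCAAAGATTC"  # 7.13 del EPIYA M-fp
SITES = {"XhoI": "CTCGAG", "SmaI": "CCCGGG"}

# The top strand of the upstream amplicon ends with the reverse complement
# of its reverse primer; the downstream amplicon starts with M-fp.
upstream_end = reverse_complement(M_RP)
n = overlap_length(upstream_end, M_FP)
shared = M_FP[:n]

print(f"overlap: {n} nt ({shared})")
for name, site in SITES.items():
    print(f"{name} site in overlap: {site in shared}")
```

The same check can be applied to any other primer pair in Table 12 that is meant to form an SOE junction.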

Western Strains

The Western strains (EPIYA-ABt, -ABtC, -ABtCC, -ABtCCC, and -ABtCCCC)
were created by amplification of the EPIYA repeat region from the Korean clinical
isolate DSM591. This strain was isolated from a 55-year-old female patient with gastritis
and contained an EPIYA-ABtCCCC motif (23). The t designates a natural change of the
alanine in the EPIYA-B repeat to a threonine (EPIYT). The complete EPIYA sequence
can be found in GenBank under the accession number FJ458142 (23).
EPIYA-ABt Strain

The EPIYA-ABt strain was created through a series of PCR reactions. The basic
construct was designed to contain the cagA upstream region from 7.13 (for homologous
recombination) - the EPIYA-ABt region from DSM591 - the extreme C-terminus of cagA
from DSM591 - the cagA downstream region from 7.13 (for homologous recombination;
Fig. 14). The first PCR reaction amplified the 7.13 upstream cagA region using H. pylori
strain 7.13 as a template and primers 7.13 del EPIYA-5-fp and K154-7IS-M1rp (Fig.
14A). The second PCR reaction amplified the EPIYA-ABt region from DSM591 using
the K154-7IS-M1fp and K154-7IS-M12rp primers (Fig. 14B). The third PCR reaction
used DSM591 as the template and primers K154-7IS-M12fp and K154-7IS-M2rp to
amplify the sequence from the end of the last EPIYA-C motif to the end of the replaced
EPIYA region (Fig. 14B). The fourth PCR reaction amplified the downstream cagA
region from 7.13 using primers K154-7IS-M2fp and 7.13 del EPIYA-3-rp (Fig. 14B).
These purified amplicons were then used in three different SOE reactions. The first SOE
reaction fused the 7.13 upstream region and the EPIYA-ABt region from DSM591
(products from the first and second PCR reactions) and used primers 7.13 del EPIYA-5-fp
and K154-7IS-M12rp (Fig. 14B). The second SOE reaction fused the extreme
C-terminus of CagA to the 7.13 downstream cagA region (products from the third and
fourth PCR reactions) and utilized primers K154-7IS-M12fp and 7.13 del EPIYA-3-rp
(Fig. 14B). The final SOE reaction fused the products from the first two SOE reactions
and amplified the full construct: 7.13 upstream cagA region - the EPIYA-ABt region
from DSM591 - the extreme cagA C-terminus - the 7.13 downstream cagA region. This
was accomplished with primers 7.13 del EPIYA-5-fp and 7.13 del EPIYA-3-rp (Fig.
14B). This full length product was next cloned into pGEM-T Easy, and proper orientation
was verified by EcoRI and SmaI digestion. The construct was also sequenced with T7
(Promega, Madison, WI), SP6 (Promega, Madison, WI), K154-7IS-M1fp, and
K154-7IS-M2rp primers, and the resulting strain was named DSM532. DSM577 (7.13 ΔEPIYA)
was next transformed with pDSM532, and the resulting double crossover event replaced
the kan-sacB cassette with the EPIYA-ABt construct, making the strain
SucR and KanS. Transformants were selected on 5% sucrose HBA plates. Since
sucrose resistance can easily arise via spontaneous mutation, the transformants were
further screened for sensitivity to kanamycin, and screened using the K154-7IS-M1fp and
K154-7IS-M2rp primers. Next, an internal portion of cagA encompassing the EPIYA
region was amplified using primers 7.13cagA-2742-fp and 7.13cagA-4561-rp, and
sequenced with primers 7.13 del EPIYA-5-fp, 7.13 del EPIYA-3-rp, K154-7IS-M1fp,
and K154-7IS-M2rp to ensure accurate sequencing of the replaced EPIYA region. The
resulting strain was named DSM601.
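
The counter-selection and screening logic described above can be summarized as a small decision function. This is only a sketch of the reasoning: the `Colony` fields and the wording of each outcome are illustrative names introduced here, not part of the original protocol, and the function does not replace the sequencing step.

```python
from dataclasses import dataclass


@dataclass
class Colony:
    grows_on_sucrose: bool     # SucR phenotype on 5% sucrose HBA plates
    grows_on_kanamycin: bool   # KanR phenotype
    pcr_band_present: bool     # product obtained with the EPIYA screening primers


def classify(colony):
    """Interpret a transformant of the kan-sacB (KanR, SucS) parental strain."""
    if not colony.grows_on_sucrose:
        return "cassette retained (sacB active)"
    if colony.grows_on_kanamycin:
        return "likely spontaneous sucrose resistance; cassette still present"
    if not colony.pcr_band_present:
        return "cassette lost but EPIYA region not confirmed"
    return "candidate replacement; confirm by sequencing"


for colony in (
    Colony(grows_on_sucrose=True, grows_on_kanamycin=False, pcr_band_present=True),
    Colony(grows_on_sucrose=True, grows_on_kanamycin=True, pcr_band_present=True),
):
    print(classify(colony))
```

The kanamycin check matters because sucrose resistance alone cannot distinguish loss of the cassette from a spontaneous mutation that inactivates sacB.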

EPIYA-ABtC, -ABtCC, -ABtCCC, and -ABtCCCC Strains

The EPIYA-ABtC, -ABtCC, -ABtCCC, and -ABtCCCC strains were all created
using the same strategy, through a series of five PCR reactions (Fig. 14). The overall
construct was the upstream cagA region from 7.13 (for homologous recombination) - the
different EPIYA region combinations from DSM591 - the downstream cagA region from
7.13 (for homologous recombination). The first PCR reaction used H. pylori strain 7.13
as the template and primers 7.13 del EPIYA-5-fp and K154-7IS-M1rp (Fig. 14A) to
produce the upstream region from 7.13. The second PCR reaction amplified the EPIYA
motifs of interest from DSM591 with primers K154-7IS-M1fp and K154-7IS-M2rp (Fig.
14C). Because the EPIYA region of DSM591 contains the EPIYA-ABtCCCC motif and
the EPIYA-C motifs are repeats, amplification of this region
produced a ladder of PCR products corresponding to the different numbers of EPIYA-C
repeats, from one to four repeats (Fig. 14C). The third PCR reaction used H. pylori strain
7.13 as the template and primers K154-7IS-M2fp and 7.13 del EPIYA-3-rp (Fig. 14A) to
produce the downstream cagA region from 7.13. These individual products were purified
and used in two SOE reactions. The first SOE reaction created a single product that
contained a single EPIYA motif region (-ABtC, -ABtCC, -ABtCCC, or -ABtCCCC)
from DSM591 and the 7.13 downstream cagA region (products from the second and third
PCR reactions) with primers K154-7IS-M1fp and 7.13 del EPIYA-3-rp (Fig. 14C). This
step was repeated with each of the four different EPIYA motifs indicated above. The
second SOE reaction used primers 7.13 del EPIYA-5-fp and 7.13 del EPIYA-3-rp and
created the fused 7.13 upstream cagA region - EPIYA motif region from DSM591 - 7.13
downstream cagA region constructs (Fig. 14C).
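
The ladder of products obtained from the repeat-containing template can be anticipated with simple arithmetic, which helps when deciding which gel band corresponds to which construct. The sketch below rests on stated assumptions: each EPIYA-C segment is taken to add a fixed 102 bp (34 codons), a placeholder that should be replaced with the value determined from the DSM591 sequence, and `HYPOTHETICAL_BASE_BP` is an invented length for the amplicon without any EPIYA-C repeat, not a measured value.

```python
def expected_ladder(base_bp, repeat_bp=102, max_repeats=4):
    """Map the number of EPIYA-C repeats to the expected amplicon length in bp.

    base_bp: amplicon length with zero EPIYA-C repeats (assumed input)
    repeat_bp: length added per EPIYA-C repeat (assumed constant)
    """
    if base_bp < 0 or repeat_bp <= 0 or max_repeats < 1:
        raise ValueError("lengths must be positive and max_repeats at least 1")
    return {n: base_bp + n * repeat_bp for n in range(1, max_repeats + 1)}


HYPOTHETICAL_BASE_BP = 300  # illustrative only, not taken from the study

for n, size in expected_ladder(HYPOTHETICAL_BASE_BP).items():
    print(f"AB + {n} x C: {size} bp")
```

Because neighboring rungs differ by only one repeat length, the bands need to be well resolved on the gel before individual products are excised.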

Each of these constructs was purified and subcloned into pGEM-T Easy. Proper
orientation was verified by EcoRI and SmaI digestion, and the constructs were sequenced
with T7, SP6, K154-7IS-M1fp, and K154-7IS-M2rp primers to ensure accurate
sequencing of the entire EPIYA region. This yielded the following strains: DSM570
(EPIYA-ABtC), DSM571 (EPIYA-ABtCC), DSM572 (EPIYA-ABtCCC), and DSM573
(EPIYA-ABtCCCC).

These plasmids were individually transformed into DSM577 (7.13 ΔEPIYA
strain), and the resulting double crossover event replaced the kan-sacB
cassette with the different Western EPIYA constructs, making the strains SucR and
KanS. Transformants were selected on 5% sucrose HBA plates, but since sucrose
resistance can arise via spontaneous mutation, the transformants were further screened for
sensitivity to kanamycin, and screened using the K154-7IS-M1fp and K154-7IS-M2rp
primers. Next, an internal portion of cagA that encompasses the EPIYA region was
amplified using primers 7.13cagA-2742-fp and 7.13cagA-4561-rp, and sequenced with
primers 7.13 del EPIYA-5-fp, 7.13 del EPIYA-3-rp, K154-7IS-M1fp, and
K154-7IS-M2rp to ensure accurate sequencing of the replaced EPIYA region. The resulting
H. pylori strains were named as follows: DSM602 (EPIYA-ABtC), DSM605
(EPIYA-ABtCC), DSM606 (EPIYA-ABtCCC), and DSM609 (EPIYA-ABtCCCC).


East  Asian  Strains 

The East Asian strain was created by modification of the EPIYA repeat region of
the Korean isolate DSM590. This strain was isolated from a 65-year-old male gastric
cancer patient and contained an EPIYA-ABD motif (23). The complete EPIYA sequence
can be found in GenBank under the accession number FJ458118 (23). Of note, the
EPIYA-B motif contained the normal alanine (in contrast to the created Western
strains), so site-directed mutagenesis was used to replace the alanine with a threonine so
as to provide consistency between the Western and East Asian strains.

EPIYA-ABD  Strain 

An EPIYA-ABD construct was made through a series of five PCR
reactions (Fig. 14D). This construct contained the 7.13 upstream cagA region - the
EPIYA-ABD region from strain DSM590 - the 7.13 downstream cagA region. The first
PCR reaction amplified the upstream region of 7.13 with primers 7.13 del EPIYA-5-fp
and K3-7IS-M1rp (Fig. 14A). The second PCR reaction amplified the EPIYA-ABD
region from DSM590 using the K3-7IS-M1fp and K3-7IS-M2rp primers (Fig. 14D).
The third PCR reaction used 7.13 as a template and the K3-7IS-M2fp and 7.13 del
EPIYA-3-rp primers in order to amplify the downstream cagA region of 7.13 (Fig. 14A). The
purified amplicons were then used in three different SOE reactions. The first SOE
reaction fused the 7.13 upstream cagA region and the EPIYA-ABD region from strain
DSM590 (the products from the first and second PCR reactions) with primers
7.13 del EPIYA-5-fp and K3-7IS-M2rp (Fig. 14D). These products were then used in the
final SOE PCR reaction to create the final construct (7.13 upstream cagA region -
DSM590 EPIYA-ABD region - 7.13 downstream cagA region) using primers
7.13 del EPIYA-5-fp and 7.13 del EPIYA-3-rp (Fig. 14D). This construct was then cloned
into pGEM-T Easy, and the insertion was verified by EcoRI and SmaI digestion.
Furthermore, it was also sequenced using the T7 and SP6 primers, and the resulting strain
was named DSM533.

EPIYA-ABtD Strain

In order to create the EPIYA-ABtD strain, site-directed mutagenesis was
accomplished using pDSM533 as the template. The substitution of the threonine for the
alanine in the EPIYA-B motif was completed in three PCR reactions (Fig. 14D). Two
different PCR reactions were performed using pDSM533 as the template and using
primers 7.13 del EPIYA-5-fp and K3(A/T)-rp and K3(A/T)-fp and 7.13 del EPIYA-3-rp,
respectively (Fig. 14D). The K3(A/T)-rp and K3(A/T)-fp primers were designed to
overlap and replace the alanine with the desired threonine. The products of these PCR
reactions were fused to create the EPIYA-ABtD region using primers 7.13 del EPIYA-5-fp
and 7.13 del EPIYA-3-rp (Fig. 14D). This construct was then cloned into pGEM-T
Easy and verified via sequencing with T7 and SP6 primers. The resulting strain was
named DSM547. DSM577 (7.13 ΔEPIYA) was next transformed with pDSM547, and
the resulting double crossover event replaced the kan-sacB cassette with
the EPIYA-ABtD construct, making the strain SucR and KanS. Transformants were
selected on 5% sucrose HBA plates. Again, since sucrose resistance can arise via
spontaneous mutation, the transformants were further screened for sensitivity to
kanamycin, and screened using K3-7IS-M1fp and K3-7IS-M2rp primers. Next, an
internal portion of cagA encompassing the EPIYA region was amplified using primers
7.13cagA-2742-fp and 7.13cagA-4561-rp, and sequenced using primers 7.13 del
EPIYA-5-fp, 7.13 del EPIYA-3-rp, K3-7IS-M1fp, and K3-7IS-M2rp to ensure accurate
sequencing of the replaced EPIYA region. The resulting strain was named DSM616.
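
The effect of the mutagenic primers can be illustrated by translating the K3(A/T)-fp sequence from Table 12 in each forward reading frame and looking for the EPIYT motif. This is a minimal sketch using the standard genetic code; the helper names are illustrative, and the reading frame is inferred from the primer itself rather than taken from the full cagA sequence.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an uppercase DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# K3(A/T)-fp, copied from Table 12.
K3_AT_FP = "GCCCTGAAGAGCCCATTTACACTCAAGTTGCTAAAAAGGTG"

for frame in range(3):
    peptide = translate(K3_AT_FP[frame:])
    print(frame, peptide, "EPIYT" in peptide)
```

In the frame that contains the motif, the residue after EPIY is encoded by ACT, a threonine codon in the standard genetic code, whereas alanine codons have the form GCN.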

Restorant  Strain 

As an important control for possible secondary mutations that arose during
genetic manipulation, a restorant strain was also created. This strain contains the same
genomic sequence as 7.13, which is an EPIYA-ABtC strain. The restorant was created by
amplification of the entire EPIYA region of 7.13 with primers 7.13 del CagA-fp and 7.13
del CagA-rp (Fig. 14E). This product was purified and cloned into pGEM-T Easy.
Proper orientation was verified by EcoRI and SmaI digestion, and the construct was
sequenced with the T7 and SP6 primers. The resulting strain was named DSM641.
DSM577 (7.13 ΔEPIYA) was next transformed with pDSM641, and the resulting double
crossover event replaced the kan-sacB cassette with the wild type
EPIYA construct, making the strain SucR and KanS. Transformants were selected on
5% sucrose HBA plates. The transformants were further screened for sensitivity to
kanamycin, and screened using 7.13 del EPIYA-5-fp and 7.13 del EPIYA-3-rp primers.
Again, due to a size difference between the wild type EPIYA region and the EPIYA
region containing the kan-sacB cassette, amplification using these primers could assess
integration of the new EPIYA motif and loss of the kan-sacB cassette. Next, an internal
portion of cagA encompassing the EPIYA region was amplified using primers
7.13cagA-2742-fp and 7.13cagA-4561-rp, and sequenced using primers 7.13 del EPIYA-5-fp and
7.13 del EPIYA-3-rp to ensure accurate sequencing of the replaced EPIYA region. The
resulting strain was named DSM613.

After the preliminary in vitro and in vivo characterization of the isogenic strains,
which indicated that there were second site mutations within these strains, a new restorant
strain was created using the same process as described above. However, this restorant
strain was created by transforming DSM926 (7.13 ΔEPIYA) with pDSM641. The
process to select and screen transformants was identical to the process used to create
DSM613. The resulting strain was named DSM927.

ΔcagA Strain

Another important control, needed to establish the role of cagA in any observed effects, was a
ΔcagA strain. As with the ΔEPIYA strain, the creation of this strain was a multistep
process: (1) a construct consisting of the fused upstream (5’) region and downstream (3’)
region of cagA was created, (2) a construct containing the region upstream of cagA, the
kan-sacB cassette, and the region downstream of cagA was created, and (3) this construct
was integrated into H. pylori strain 7.13 via double homologous recombination. The
region upstream of cagA was amplified with primers 7.13 del CagA-fp and 7.13 del CagA
M-rp, the latter of which was engineered with an XhoI restriction site. The downstream
region of cagA was amplified using the 7.13 del CagA M-fp primer, which was
engineered with a SmaI restriction site, and the 7.13 del CagA-rp primer (Fig. 14F).
Purified PCR products from these reactions were then combined in a SOE reaction and
amplified using the 7.13 del CagA-fp and 7.13 del CagA-rp primers to yield a fused
product of the regions upstream and downstream of cagA (Fig. 14F). The resulting PCR
product was cloned into pGEM-T Easy and size was verified by EcoRI, XhoI, and SmaI
digestion. This construct was also sequenced with the T7 and SP6 primers, and the
resulting strain was named DSM598. Next, pDSM598 was digested with XhoI and SmaI
and ligated to the purified kan-sacB cassette obtained by similar digestion of pDSM3.
The resulting construct contained the region upstream of cagA, the kan-sacB cassette, and
the region downstream of cagA. This construct was cloned into pGEM-T Easy, and
proper orientation was confirmed by EcoRI and SmaI digestion. The resulting strain was
named DSM599. pDSM599 was then transformed into H. pylori 7.13, where it
integrated into the chromosome via double homologous recombination. Transformants
were selected based on kanamycin resistance. Due to the difference in size between the
wild type cagA and the cagA replaced with the kan-sacB cassette, integration was
confirmed via PCR using the 7.13 del CagA-fp and 7.13 del CagA-rp primers. Proper
integration was also confirmed by successful amplification using a primer that lies within
the kan-sacB cassette (SacBSCN-F2) and a primer in the glutamate racemase gene
immediately downstream of cagA. The resulting ΔcagA strain was named DSM600.

The ΔcagA, ΔEPIYA, EPIYA-AB', -AB'C, -AB'CC, -AB'CCC, -AB'CCCC, -AB'D,
and restorant strains were successfully created and verified by sequencing.

Growth  Dynamics 

To confirm that each of the isogenic strains behaved phenotypically like the wild
type 7.13 strain, growth kinetics were monitored for each strain. Bacterial strains were
grown and expanded on HBA plates for approximately 20 hours and used to inoculate 25
mL liquid cultures, which were subsequently grown for approximately 18 hours. These
starter cultures were then used to inoculate a 75 mL liquid culture of each strain at a
starting optical density at 600 nm of 0.05. The cultures were grown microaerobically
with shaking at 100 rpm at 37°C. For each strain, a 3 mL aliquot was taken at time=0, 4,
8, 12, 20, 28, and 36 hours to assess optical density as well as colony forming units
(CFU) per mL of culture, which was determined by serial dilution and plating on HBA
plates. Two biologically independent replicates of this experiment were performed.
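
As an illustration of how CFU per mL can be derived from serial dilution and plating, a minimal Python sketch is shown below. The function name, the plated volume, and the example colony count are hypothetical and are not values from this study; the sketch only shows the standard back-calculation from a plate count to the undiluted culture.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Back-calculate CFU per mL of the undiluted culture.

    colonies          -- colonies counted on one plate
    dilution_factor   -- total fold-dilution of the plated sample
                         (e.g. 1e5 for a 10^-5 dilution)
    plated_volume_ml  -- volume spread on the plate, in mL
    """
    if dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


# Hypothetical example: 42 colonies from 0.1 mL of a 10^-5 dilution.
example = cfu_per_ml(42, 1e5, 0.1)
print(f"{example:.2e} CFU/mL")
```

Plotting the resulting CFU per mL values against time on a logarithmic axis, alongside the optical density readings, would give growth curves of the kind compared between strains here.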

As  shown  in  Fig.  15,  each  of  the  strains  showed  a  pattern  of  growth  that  virtually 
mirrored  that  of  the  wild  type  H.  pylori  7.13  strain.  These  data  suggested  that  genetic 
manipulation  of  the  strains  did  not  result  in  any  overt  secondary  mutations  that  slowed 
the growth rate of any of the isogenic strains.

Figure 15: Growth dynamics of the isogenic strains. Samples were taken from OD
controlled liquid cultures across various time points, and serial dilutions were plated to
obtain single colonies. This figure is representative of two biological repeats.

Expression of CagA

After growth dynamics of each strain were assessed, CagA expression was
verified for each of the isogenic strains. Expression was assessed through Western blot
analysis using bacteria grown on plates and in liquid culture. First, 18-20 hour lawns of
each strain were harvested from HBA plates, pelleted, resuspended in 1X
phosphate-buffered saline (PBS; EMD Chemicals, Inc., Gibbstown, NJ), and then mixed with 5X
Laemmli sample buffer for qualitative Western blot analysis. For quantitative CagA
protein analysis, an additional aliquot of the 18-20 hour lawn grown bacteria was
lysed with 300 µL of lysis buffer [150 mM NaCl, 50 mM Tris-HCl, pH 8.0, 5 mM
EDTA, 1% sodium dodecyl sulfate (SDS), and 10% glycerol, containing one Complete
Mini Protease Inhibitor Cocktail Tablet (Roche Diagnostic, Indianapolis, IN) and 100 µL
10 mM sodium orthovanadate (added at the time of use) per 10 mL of lysis buffer].
Lysates were sonicated and then centrifuged to remove unlysed cellular debris. The
amount of protein in each lysate was subsequently quantified using the BCA Protein
Assay Kit (Thermo Scientific/Pierce, Rockford, IL).

Two different experiments were performed to assess CagA expression in liquid
culture across a time course. The first experiment used aliquots taken from one large
liquid culture at various time points, while the second experiment involved inoculation of
one large liquid culture, which was then divided into smaller cultures. These smaller
cultures were analyzed due to growth differences observed with various sizes of liquid
culture. For both of these experiments, bacteria were grown and expanded on HBA plates
for 20-24 hours. The bacteria were then used to inoculate starter liquid cultures, which
were grown for approximately 18 hours. These starter cultures were used to inoculate an
OD controlled liquid culture of each strain at a starting optical density at 600 nm of 0.05.
In the first experiment, aliquots from a 120 mL OD controlled liquid culture were taken
at time=0, 9, 18, 27, and 40 hours for Western blot analysis. These bacterial samples were
lysed with 300 µL of the lysis buffer mentioned above, sonicated, and then centrifuged to
remove unlysed cellular debris. In experiment two, starter cultures were used to
inoculate a 40 mL liquid culture of each strain at a starting optical density at 600 nm of
0.05. This OD controlled liquid culture was then divided into five 7 mL cultures. One
culture from each strain was collected at time=0, 9, 18, 27, and 40 hours for Western blot
analysis. These bacterial samples were lysed with 300 µL of the lysis buffer mentioned
above, sonicated, and then centrifuged to remove unlysed cellular debris. Subsequent
protein quantification was performed using the BCA Protein Assay Kit.

For quantitative Western blot analysis, equal amounts of protein plus 6 µL of 5X
Laemmli sample buffer were added to each well. Bacterial lysates were separated by
sodium dodecyl sulfate-polyacrylamide gel electrophoresis using a 6% separating gel
and a 4% stacking gel. Proteins were then transferred to nitrocellulose membranes
(Thermo Scientific, Rockford, IL) using a semidry transfer apparatus (Owl; Thermo
Scientific, Rochester, NY) at 300 mA for 45 minutes.
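
Loading equal amounts of protein requires converting each lysate concentration from the BCA assay into a loading volume. The following Python sketch illustrates that arithmetic under stated assumptions: the function name, the target protein mass, and the example concentrations are hypothetical and are not taken from this study.

```python
def loading_volumes_ul(conc_ug_per_ul, target_ug):
    """Return the lysate volume (in uL) that contains target_ug of protein.

    conc_ug_per_ul -- dict mapping sample name to protein concentration
                      (ug/uL), e.g. as measured by a BCA assay
    target_ug      -- protein mass to load in every well (ug)
    """
    volumes = {}
    for sample, conc in conc_ug_per_ul.items():
        if conc <= 0:
            raise ValueError(f"non-positive concentration for {sample}")
        volumes[sample] = target_ug / conc
    return volumes


# Hypothetical concentrations; 10 ug is an assumed target load.
example = loading_volumes_ul({"wild type": 2.0, "restorant": 4.0}, 10.0)
for sample, volume in example.items():
    print(f"{sample}: load {volume:.1f} uL of lysate")
```

More dilute lysates therefore need proportionally larger volumes so that every lane carries the same protein mass before the sample buffer is added.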

Membranes were probed with a 1:5,000 dilution of rabbit IgG anti-CagA
polyclonal antibody b-300 (Santa Cruz Biotechnology, Santa Cruz, CA), followed by a
1:20,000 dilution of HRP-conjugated bovine anti-rabbit IgG secondary antibody (Santa
Cruz Biotechnology, Santa Cruz, CA). Proteins were detected using the SuperSignal
West Pico chemiluminescent substrate kit (Thermo Scientific/Pierce, Rockford, IL) and a
LAS-3000 Intelligent Dark Box with LAS-3000 Lite capture software (Fujifilm,
Stamford, CT). Densitometry was performed using MultiGauge software (Fujifilm,
Stamford, CT).

For this series of experiments, the qualitative Western blots of plate grown
bacteria demonstrated that each of the isogenic strains (except the ΔcagA and ΔEPIYA
strains) expressed CagA. Moreover, the size of the resulting protein increased with an
increasing number of EPIYA motifs (Fig. 16A). Four biological repeats of quantitative
Western blots on the plate grown bacteria suggested that each of the isogenic strains
expressed less CagA than the wild type strain. The biggest difference was seen with the
EPIYA-AB'CCCC strain, which expressed exactly half of the amount of CagA as the
wild type strain (Fig. 16B). Of note, the restorant strain, which should be genetically
identical to the wild type strain, also expressed slightly lower levels of CagA.
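
Because the quantitative expression data are reported as percent of wild type and summarized by the geometric mean of the biological repeats, the calculation can be sketched as follows. This is a minimal Python illustration only: the function names and the densitometry values are hypothetical and are not data from this study.

```python
import math


def percent_of_wild_type(strain_signal, wild_type_signal):
    """Express a densitometry signal as a percentage of the wild type signal."""
    if wild_type_signal <= 0:
        raise ValueError("wild type signal must be positive")
    return 100.0 * strain_signal / wild_type_signal


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical (strain, wild type) signal pairs from four biological repeats.
repeats = [(40.0, 100.0), (55.0, 110.0), (45.0, 90.0), (60.0, 100.0)]
percents = [percent_of_wild_type(s, wt) for s, wt in repeats]
print(f"geometric mean: {geometric_mean(percents):.1f}% of wild type")
```

Normalizing within each repeat before averaging keeps blot-to-blot differences in exposure from dominating the summary value.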

Analysis of the bacterial lysates harvested from liquid grown cultures was more
problematic. Samples from the larger liquid cultures showed variable amounts of full
length CagA as well as the appearance of numerous degradation products. Indeed,
technical repeats showed that the majority of CagA was found in the degraded CagA
bands. More full-length CagA was apparent in the smaller individually grown liquid
cultures; however, there was still more degraded CagA in these lysates than in the lawn
grown bacteria. The reason for this difference remains unclear.

Figure 16: CagA Expression. A. The qualitative Western blot on plate grown bacteria
shows expression of CagA for each isogenic strain. The blot illustrates that the strains
expected to express CagA do so at the expected size, which varies based on the number
of EPIYA repeats. B. This histogram shows quantitative CagA expression. Total CagA
expression was determined by densitometric analysis of quantitative Western blots of
four biological repeats of lawn grown bacteria. Numbers are presented as percent of wild
type and the column represents the geometric mean.

Localization of Bacteria

A previous study suggested that CagA affects bacterial localization on the surface
of host cells (2). Therefore, we next wanted to examine whether the EPIYA region affected
where the bacteria would localize on the host cells. Cover slips were placed in six well
plates and left under a UV light for 20 minutes. One mL of collagen was then placed in
each well and the cover slips were allowed to incubate at room temperature for one hour.
Collagen was then aspirated and re-frozen for future use. Wells were washed three times
with three mL of 1X PBS and then allowed to air dry. The plates were then placed under
the UV light for an additional 20 minutes and stored at 4°C for less than one week. The
resulting collagen coated cover slips were seeded with 4 x 10^5 AGS cells and allowed to
grow to confluency for approximately four days. One hour before the infection, cells
were washed with 1X PBS and 1 mL of fresh warm cell culture medium was added:
Dulbecco’s modified Eagle’s medium without L-glutamine (Quality Biological, Inc.,
Gaithersburg, MD), supplemented with 10% fetal bovine serum, 10 µg/mL vancomycin,
and 2 mM L-glutamine (Quality Biological, Inc.). Of note, warm medium is needed to
prevent cell stress and subsequent detachment. Twelve hour OD controlled liquid
cultures were then used to infect cells at a multiplicity of infection (MOI) of 100.
Infections were allowed to proceed for 5 minutes and then the cells were washed three
times with warm cell culture medium and allowed to incubate for 10 minutes before
being washed with 1X PBS and fixed with 2% paraformaldehyde in 100 mM phosphate
buffer (pH 7.4). The cells on the cover slips were then permeabilized and blocked in a
1X PBS solution containing 3% BSA, 1% saponin (ACROS Organics/Thermo Scientific,
New Jersey), and 0.05% sodium azide (EM Science, Gibbstown, NJ). The cover slips
were subsequently incubated simultaneously with the primary antibodies, mouse
monoclonal IgG1 anti-ZO-1 (Invitrogen) and rabbit polyclonal IgG anti-H. pylori
(Thermo Scientific), which were diluted to final concentrations of 1 µg/mL and 1.25
µg/mL, respectively, in a 1X PBS solution containing 3% BSA and 0.05% sodium azide.
The cover slips were next washed three times in a 1X PBS solution containing 3% BSA
and 0.05% sodium azide for 10 minutes each time. The secondary antibodies, Alexa
Fluor 555-conjugated goat anti-mouse IgG1 (Invitrogen) and Alexa Fluor 488-conjugated
goat anti-rabbit IgG (Invitrogen), and DAPI were diluted to final concentrations of 2
µg/mL, 2.5 µg/mL, and 10 µg/mL, respectively, in a 1X PBS solution containing 3%
BSA and 0.05% sodium azide. The cover slips were incubated with the secondary
antibodies and DAPI for 45 minutes, and were then washed three times in a 1X PBS
solution containing 3% BSA and 0.05% sodium azide for 10 minutes each time. The
cover slips were next rinsed twice with 1X PBS and once with sterile double distilled
water, removed, placed cell side down on a drop of VectaShield (Vector
Laboratories, Inc., Burlingame, CA) on a pre-cleaned slide, and sealed on alternating
corners with nail polish. The nucleus (blue - DAPI), cellular junctions (red - ZO-1), and
H. pylori (green) were visualized by collecting z-stacks with a Zeiss LSM 710 confocal
microscope and projecting the stacks with ZEN2009 software.

Representative images from the wild type and the EPIYA-AB'CCCC strains are
shown in Fig. 17, and represent infected cells stained for the nucleus (blue - DAPI),
cellular junctions (red - ZO-1), and H. pylori (green). No major differences in
localization were seen among the various strains, including the ΔcagA strain, which has
been shown previously to localize throughout the cell surface (2). In fact, all strains
seemed to localize to the cellular tight junctions, regardless of cagA status.

Figure 17: Localization of bacteria on host cells. Confocal images of infected AGS
cells stained for the nucleus (blue - DAPI), cellular junctions (red - ZO-1), and H. pylori
(green) are shown. The image on the left shows AGS cells infected with wild type
bacteria, while the image on the right shows AGS cells infected with the
EPIYA-AB'CCCC strain.

Adherence and Internalization Assays

While cagA status has been suggested to affect localization, no difference in total
adherence has been observed between wild type and ΔcagA strains of H. pylori in vitro
(2). However, if there were significant differences in bacterial adherence or
internalization, this could affect how much CagA could be translocated into host cells.
Thus, we performed adherence and internalization assays, which were adapted from a
previous study (12). Both the adherence and internalization assays were conducted on
the same day with the same cultures in order to determine percent internalization. In short, adherence assays
were completed by seeding 24 well plates with 2.2 x 10^5 AGS cells for 21 hours. One
hour before the infection, cells were washed with 1X PBS and 1 mL of fresh warm cell
culture medium, as described in the localization section, was added to the cells. Again,
warm medium was needed to prevent cell stress and subsequent detachment. Next, 12
hour OD controlled liquid cultures were used to infect three wells per strain at an MOI of
10. Infections were allowed to progress for 30 minutes and wells were then washed three
times with warm cell culture medium. Cells were next lysed with 1 mL of 1% saponin in
1X PBS. These lysates were then serially diluted and plated in triplicate per well for single
colonies. Counts from replicate plates were averaged, and then the averaged counts
from the three replicate wells were averaged to yield a total number of
adherent bacteria. This number was then divided by the inoculum to determine the
percent of adherent bacteria. Four biological repeats were performed.
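
The averaging scheme described above (replicate plates averaged within each well, wells averaged per strain, and the result divided by the inoculum) can be sketched in Python as follows. This is an illustrative sketch only: the function name, dilution factor, plated volume, inoculum, and colony counts are hypothetical and are not values from this study.

```python
from statistics import mean


def percent_adherent(plate_counts_per_well, dilution_factor,
                     plated_volume_ml, lysate_volume_ml, inoculum_cfu):
    """Percent of the inoculum recovered as cell-associated bacteria.

    plate_counts_per_well -- one list of replicate plate counts per well
    dilution_factor       -- total fold-dilution of the plated lysate
    plated_volume_ml      -- volume spread per plate (mL)
    lysate_volume_ml      -- total lysate volume per well (mL)
    inoculum_cfu          -- CFU added to each well
    """
    cfu_per_well = []
    for counts in plate_counts_per_well:
        avg_colonies = mean(counts)  # average the replicate plates
        cfu = avg_colonies * dilution_factor / plated_volume_ml * lysate_volume_ml
        cfu_per_well.append(cfu)
    # Average the replicate wells, then express relative to the inoculum.
    return 100.0 * mean(cfu_per_well) / inoculum_cfu


# Hypothetical counts: three wells, three plates each, 10^-2 dilution,
# 0.1 mL plated from a 1 mL lysate, inoculum of 1e6 CFU per well.
wells = [[10, 10, 10], [20, 20, 20], [30, 30, 30]]
print(f"{percent_adherent(wells, 100, 0.1, 1.0, 1e6):.2f}% adherent")
```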

For internalization assays, 24 well plates were seeded with 2.2 x 10^5 AGS cells
for 21 hours. One hour before the infection, cells were washed with 1X PBS and 1 mL
of fresh warm cell culture medium was added to the cells. The same 12 hour OD controlled
liquid cultures used in the adherence assay were used to infect three wells per strain at an
MOI of 10. Infections were allowed to progress for 30 minutes and wells were then
washed three times with warm cell culture medium. At that point, 1 mL of fresh warm
cell culture medium supplemented with 200 µg/mL gentamicin (Gibco, Carlsbad, CA)
was added to each well in order to kill all external bacteria. Infected cells were incubated
for 2 hours and cells were washed again five times with warm cell culture medium. Cells
were then lysed with 500 µL of 1% saponin. The lysate was plated in 100 µL portions on
five individual plates to obtain the actual number of internalized bacteria per well. Numbers
from the replicate wells per strain were then averaged and divided by the inoculum to
obtain the percent of internalized bacteria. Four biological repeats were performed.
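
Since the entire 500 µL lysate is plated across five plates, the internalized count per well is simply the sum of the colonies on those plates. A minimal Python sketch of this calculation, and of expressing internalized bacteria as a percentage of adherent bacteria, is given below. The function names and all numbers are hypothetical and are not data from this study.

```python
from statistics import mean


def internalized_per_well(plate_counts):
    """Total internalized CFU in one well; the whole lysate was plated."""
    return sum(plate_counts)


def percent_internalized(wells, inoculum_cfu):
    """Average internalized CFU across wells as a percent of the inoculum."""
    per_well = [internalized_per_well(counts) for counts in wells]
    return 100.0 * mean(per_well) / inoculum_cfu


def percent_of_adherent(internalized_cfu, adherent_cfu):
    """Internalized bacteria as a percentage of adherent bacteria."""
    if adherent_cfu <= 0:
        raise ValueError("adherent CFU must be positive")
    return 100.0 * internalized_cfu / adherent_cfu


# Hypothetical counts: three wells, five plates per well.
wells = [[1, 2, 3, 4, 5], [3, 3, 3, 3, 3], [5, 4, 3, 2, 1]]
print(f"{percent_internalized(wells, 1.5e6):.4f}% of inoculum internalized")
print(f"{percent_of_adherent(15, 1500):.1f}% of adherent bacteria internalized")
```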

We were unable to demonstrate the high levels of adherence previously observed
in other studies; however, this was likely because our bacterial cultures were not
co-cultured with AGS cells (12). Adherence data revealed approximately a log difference in
attachment between the wild type and the ΔcagA strains (Fig. 18A). The most adherent
strain was the EPIYA-AB'D strain, which showed a log increase in the amount of
adherent bacteria as compared to the wild type strain. While the wild type and the
EPIYA-AB'C strains showed similar levels of adherent bacteria, there was an inverse
trend between the number of EPIYA-C motifs and number of adherent bacteria. The
least adherent strain was the EPIYA-AB'CCCC strain (Fig. 17 and 18A). Of note, the
restorant strain, which should behave similarly to the wild type strain, showed more than a
log decrease in adherence as compared to the wild type and the EPIYA-AB'C strains.

A fraction of the adherent bacteria are actually internalized into host cells (13). In
our hands the percentage of adherent bacteria internalized ranged from 0.1% to 1.4%.
Again, we observed little difference in the amount of wild type and ΔcagA bacteria able
to be internalized into host cells (Fig. 18B). There was a minimal difference in
internalization of the wild type, EPIYA-AB'C, and the restorant strains. Once again we
observed an inverse trend between the number of EPIYA-C motifs and the number of
internalized bacteria. The strain with the lowest percent of internalized bacteria was the
EPIYA-AB'CCCC strain. Conversely, the EPIYA-AB'D strain showed the most
internalized bacteria. Surprisingly, the strain with the highest percentage of
internalization was the restorant strain, while the strain with the lowest percentage of
internalization was the EPIYA-AB'C strain. Taken together, the fact that the restorant,
EPIYA-AB'C, and wild type strains did not behave the same in the adherence and
internalization assays suggested that there may be unexpected differences in the strains.

Figure 18: Adherence and Internalization. A. A histogram of the number of adherent
bacteria per the inoculum for each of the isogenic strains is shown. Four biological
repeats were performed, and the geometric mean is designated by the column. B. A
histogram of the number of internalized bacteria per the inoculum for each isogenic
strain is shown. Four biological repeats were performed, and the geometric mean is
designated by the column.

CagA  Phosphorylation  Assays 

In order for CagA to be biologically active, it must be expressed and translocated
into host cells where it is phosphorylated by host cell kinases (27). Therefore, we next
assessed the ability of the isogenic strains to translocate CagA into the host cells as
evidenced by the appearance of the phosphorylated protein. The CagA phosphorylation
assays were conducted essentially as previously described (7, 23). Briefly, six-well tissue
culture plates were seeded with 3.5 x 10^5 AGS cells, and allowed to grow in normal cell
culture medium for 3 days. Two hours prior to infection, AGS cells were washed with
1X PBS, and 3 mL of fresh cell culture medium was added to each well. Eighteen hour OD
controlled liquid cultures of each H. pylori strain were resuspended in 1 mL of 1X PBS
and were used to infect AGS cells at an MOI of 100.

Infected cell lysates were collected from replicate wells every hour for the first 8
hours and again at 16 and 24 hours post infection. Infected cells were lysed with 250 µL of
the lysis buffer described in the CagA expression section. Lysates were sonicated and
then centrifuged to remove unlysed cellular debris. Subsequent protein quantification
was performed using the BCA Protein Assay Kit. Equal amounts of protein and 6 µL of
5X Laemmli sample buffer were then added to each well. Infected cell lysates were
separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis, using a bilayer
separating gel (6% and then 12%) and a 4% stacking gel, and proteins were transferred to
nitrocellulose membranes using a semidry transfer apparatus at 300 mA for 90
minutes.

Membranes were probed with a 1:5,000 dilution of an anti-phospho-tyrosine
monoclonal antibody, pY100 (Cell Signaling Technology, Danvers, MA), followed by a
1:20,000 dilution of HRP-conjugated goat anti-mouse IgG (Santa Cruz Biotechnology,
Santa Cruz, CA) secondary antibody, and detection was performed as described above.
Membranes were subsequently stripped with periodic agitation at 55°C in a pre-warmed
10 mM dithiothreitol (DTT) solution for 45 minutes. The blots were then rinsed well with
running deionized water to remove residual DTT. Membranes were then probed with a
1:5,000 dilution of rabbit IgG anti-CagA polyclonal antibody b-300, followed by a
1:20,000 dilution of HRP-conjugated bovine anti-rabbit IgG secondary antibody.
Membranes were then re-stripped and reprobed with a 1:1,000 dilution of
goat anti-GAPDH IgG (Santa Cruz Biotechnology, Santa Cruz, CA) plus 1% BSA (EMD
Chemicals, Inc., Gibbstown, NJ), followed by a 1:20,000 dilution of HRP-conjugated
donkey anti-goat IgG (Santa Cruz Biotechnology, Santa Cruz, CA) secondary antibody
plus 1% BSA. Detection was conducted at each step as described above. Densitometry
was performed using MultiGauge software, and the level of CagA phosphorylation was
normalized to the level of GAPDH.
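
The normalization step can be illustrated with a short Python sketch that divides each lane's phospho-CagA band intensity by the GAPDH intensity from the same lane and then expresses the result relative to wild type. The function names and the intensity values are hypothetical; they are not the output of the densitometry software or data from this study.

```python
def normalize_to_gapdh(phospho_signal, gapdh_signal):
    """Divide each lane's phospho-CagA signal by its GAPDH loading control.

    Both arguments are dicts keyed by lane (strain) name.
    """
    normalized = {}
    for lane, signal in phospho_signal.items():
        control = gapdh_signal[lane]
        if control <= 0:
            raise ValueError(f"non-positive GAPDH signal in lane {lane}")
        normalized[lane] = signal / control
    return normalized


def relative_to(normalized, reference_lane):
    """Express normalized signals as a fold of the reference lane."""
    ref = normalized[reference_lane]
    return {lane: value / ref for lane, value in normalized.items()}


# Hypothetical band intensities (arbitrary units).
phospho = {"wild type": 200.0, "restorant": 80.0}
gapdh = {"wild type": 100.0, "restorant": 100.0}
norm = normalize_to_gapdh(phospho, gapdh)
print(relative_to(norm, "wild type"))
```

Dividing by the loading control first means that a lane with less total lysate is not mistaken for a lane with less phosphorylated CagA.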

Since the primary phosphorylation sites are the tyrosine residues found within the
EPIYA-C or -D motif, it was not surprising that the EPIYA-AB' strain did not show any
considerable accumulation of phosphorylated CagA at any time point. All other strains
that expressed CagA showed a detectable amount of phosphorylated CagA by 2 hours
post infection. Phosphorylation increased from 2 hours onward, and peak
phosphorylation was observed between 4 and 8 hours. Phosphorylation then decreased
between 8 and 16 hours, and by 24 hours the amount of phosphorylated CagA was
negligible.

When a single time point of 5 hours was assessed across the different strains, we
observed some expected trends (Fig. 19). There was no detectable phosphorylated CagA
in the ΔcagA or the EPIYA-AB' strains. The three isogenic strains that contained only
one EPIYA-C or -D motif (EPIYA-AB'C, EPIYA-AB'D, and the restorant) showed similar levels
of phosphorylation. However, this amount of phosphorylated protein was almost 2.5 times
less than the level of phosphorylated CagA seen in the wild type strain. Increasing
amounts of phosphorylated CagA were detected with increasing numbers of EPIYA-C
motifs (from one to three), and the greatest amount of phosphorylated CagA was seen
with the EPIYA-AB'CCC and EPIYA-AB'CCCC strains. While the trends for the
isogenic strains were what we expected based on the number of EPIYA motifs, the fact
that the restorant, EPIYA-AB'C, and EPIYA-AB'D strains showed reduced levels of
phosphorylated CagA as compared to the wild type strain indicated once again that the
strains may not be isogenic or may have secondary mutations.

Figure 19: Phosphorylation of CagA. A. A Western blot of H. pylori infected cell
lysates is shown. Lysates were collected five hours post infection and subjected to
Western blot analysis. The membrane was first probed with an anti-phosphotyrosine
antibody (top panel) and then stripped and reprobed using an anti-GAPDH antibody
(bottom panel) in order to normalize for equal loading. B. The histogram represents
densitometric analysis of the Western blot shown above, where the data were normalized
for equal loading using the GAPDH blot and are expressed in phosphorylation units.

Morphological  Changes 

One of the easiest ways to grossly assess CagA-modulated host cell signaling
pathways is to look at morphological changes induced by H. pylori in infected AGS cells.
In these cells, CagA delivery results in the stereotypical “hummingbird” phenotype,
which is characterized by long fingerlike protrusions from the cell (27). We therefore
wanted to assess EPIYA-dependent differences in modulation of host signaling pathways.
Morphology assays were adapted from a method described previously (7). Briefly, cover
slips were placed in six well plates and collagen coated for one hour as described above.
AGS cells were seeded at a density of 2.5 x 10^5 cells, and were allowed to grow in
normal cell culture medium for approximately 24 hours. Eighteen hour OD controlled liquid
cultures were used to infect the AGS cells at an MOI of 100. Infections were allowed to
proceed for 3, 6, or 9 hours, and then samples were fixed with 2% paraformaldehyde in
100 mM phosphate buffer (pH 7.4). Wells were assigned random numbers so that the
analysis was blinded. Images were captured on a Zeiss LSM Pascal confocal microscope,
50 cells from each well were counted, and the length and breadth of each
cell were measured (7). Two biological repeats were performed.

At six hours post infection, which was the time frame in which we saw maximal
CagA phosphorylation, the number of elongated cells in the uninfected group was similar
to that in cells infected with the ΔcagA strain. Surprisingly, the restorant and EPIYA-AB'C
strains showed only a slight increase in the number of elongated cells as compared to the
Figure 20: Morphology of infected host cells. A. An image of an elongated AGS cell
with annotations for the breadth and length of the cell is shown. These depictions of
breadth and length are representative of the areas used for measurements during the
morphology experiments. The measurement methodology is adapted from Bourzac et al.
(7). B. The left panel is a confocal image of AGS cells infected with the wild type strain
for nine hours, and the right panel is a confocal image of AGS cells infected with the
ΔcagA strain for nine hours. C. The top histogram depicts the percent of elongated cells
observed six hours post infection with the different isogenic strains. The bottom
histogram depicts the percent of elongated cells recorded nine hours post infection with
the different isogenic strains.



ΔcagA strain. Conversely, the wild type, EPIYA-AB1, -ABlCCCC, and -ABlD strains all
induced similar levels of elongated cells, and the EPIYA-AB'CC and -AB'CCC strains
induced an increased number of elongated cells.

Some strains showed maximum phosphorylation at eight hours post infection.
Thus, in order to assess changes in those strains, we also looked at the
morphological changes at nine hours post infection. Some trends continued at nine hours
post infection; cells treated with brucella broth (uninfected cells) or the ΔcagA strain
showed a minimal amount of cell elongation. While the number of elongated cells
induced by the restorant increased, this strain still induced the fewest elongated
cells of all the strains that could deliver CagA. At this time point, the EPIYA-AB1,
-ABlC, -AB'CC, and -ABlD strains all induced similar levels of elongated cells, and the
wild type and EPIYA-AB'CCCC strains induced similar and increased levels of elongated
cells. The strain that induced the largest percentage of elongated cells (~57%) was the
EPIYA-ABlCCC strain (Fig. 20). Once again, the fact that the wild type, EPIYA-AB'C,
and restorant strains all behaved so differently suggested that there is a second site
mutation within at least some of the strains.

In  vivo  analysis  of  isogenic  strains 

A  large  scale  animal  study  was  conducted  to  assess  differences  between  the 
isogenic  strains.  Infection  protocols  were  similar  to  those  previously  published  (16,  25). 
Due  to  the  large  number  of  animals  (N=715)  to  be  infected,  Mongolian  gerbils  were 
obtained  from  three  different  age  groups:  28-35  days,  35-42  days,  and  42-49  days.  The 
different  age  groups  were  evenly  divided  across  the  different  infection  groups  and  the 
animals were age matched at each time point. Male Mongolian gerbils (Charles River
Laboratories International, Inc., Wilmington, MA) were fasted 12 hours prior to infection,
and infected orogastrically with approximately 10^8 H. pylori cells for each of the different
isogenic strains. Once again, due to the large number of animals (N=715) to be infected,
the infection groups were divided into two groups. Group 1 consisted of animals that
were mock infected (n=55), or were infected with the EPIYA-AB'C strain (n=110),
EPIYA-ABlCCC strain (n=110), or the EPIYA-ABlD strain (n=110). Group 2 consisted
of animals infected with the ΔcagA strain (n=110), wild type 7.13 strain (n=110), or the
EPIYA-AB1 strain (n=110). Animals were age matched and assigned random numbers
for  blinding  purposes.  A  variety  of  time  points  were  assessed  (2,  4,  6,  8,  10,  12,  16,  20, 
24,  30,  and  36  weeks)  and  10  animals  per  time  point  per  strain  were  sacrificed.  For  the 
mock  infected  animals,  5  animals  per  time  point  were  sacrificed.  At  the  indicated  times, 
the  stomachs  were  harvested,  and  the  glandular  portion  of  the  stomach  was  bisected. 

Half of the tissue was paraffin embedded, sectioned, and stained with hematoxylin and
eosin. The other half of the stomach was weighed and homogenized in brucella broth
with a mechanical homogenizer (Tissue-Tearor; Biospec Products Inc., Bartlesville, OK),
and the number of viable CFU was determined by plating on HBA plates supplemented
with 50 µg/mL vancomycin, 10 µg/mL nalidixic acid (Sigma, St. Louis, MO), and 100
µg/mL bacitracin (USB Corporation, Santa Clara, CA).
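
The group sizes described above follow directly from the sacrifice schedule, and the arithmetic can be checked with a short sketch. The time points and per-time-point animal numbers are taken from the text; the strain labels are shorthand chosen for this example and are not the exact strain names used in the study.

```python
# Time points (weeks post infection) as listed in the text.
TIME_POINTS_WEEKS = [2, 4, 6, 8, 10, 12, 16, 20, 24, 30, 36]

# Animals sacrificed per time point, as stated in the text.
ANIMALS_PER_TIME_POINT_INFECTED = 10
ANIMALS_PER_TIME_POINT_MOCK = 5

# Shorthand labels for this sketch only (not the thesis's exact strain names).
GROUP_1 = ["mock", "ABC", "ABCCC", "ABD"]
GROUP_2 = ["delta_cagA", "wild_type_7.13", "AB"]


def animals_for(label):
    """Total animals needed for one infection group across all time points."""
    if label == "mock":
        per_time_point = ANIMALS_PER_TIME_POINT_MOCK
    else:
        per_time_point = ANIMALS_PER_TIME_POINT_INFECTED
    return per_time_point * len(TIME_POINTS_WEEKS)


group_sizes = {label: animals_for(label) for label in GROUP_1 + GROUP_2}
total_animals = sum(group_sizes.values())

print(group_sizes)
print(total_animals)
```

Eleven time points at 10 animals each give 110 animals per infected group, 5 animals each give 55 mock-infected animals, and six infected groups plus the mock group account for the 715 animals in the study.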

The most striking result from the in vivo study was the fact that the EPIYA-ABlD
strain showed a dramatic decrease in colonization as compared to the other isogenic
strains used to infect the Mongolian gerbils at every time point throughout the
experiment (Fig. 21A). For convenience, colonization was defined by detection of a
Figure 21: In vivo colonization. A. A graph of the percent of animals colonized with the
various isogenic strains for each time point is shown. Colonization was defined as our
ability to detect at least one bacterium. The level of detection was decreased after the six
week post infection time point. B. A histogram of the overall percentage of animals
colonized per the various isogenic strains is shown.
single bacterium. The overall colonization rate was 93% or above for animals infected
with every strain (ΔcagA 99%, wild type 100%, EPIYA-ABl 98%, EPIYA-AB'C 93%,
and EPIYA-AB'CCC 100%), except for the EPIYA-ABlD strain, for which it was 57% (Fig.
21B). However, within the animals infected with the EPIYA-ABlD strain there were
dramatic differences in colonization loads. Of the animals that were colonized with the
EPIYA-ABlD strain, only 67% showed colonization levels similar to wild type.
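
The colonization definition used here (detection of at least one bacterium) can be expressed as a simple calculation over per-animal plate counts. The sketch below is illustrative only: the function names and the example counts are hypothetical and do not reproduce data from the study.

```python
def is_colonized(cfu_count):
    """An animal counts as colonized if at least one bacterium was detected."""
    return cfu_count >= 1


def percent_colonized(cfu_counts):
    """Percent of animals in a group with at least one detected bacterium."""
    if not cfu_counts:
        raise ValueError("no animals in this group")
    n_colonized = sum(is_colonized(count) for count in cfu_counts)
    return 100.0 * n_colonized / len(cfu_counts)


# Made-up plate counts for ten animals at one time point.
example_counts = [0, 120, 3, 0, 45000, 1, 0, 800, 0, 2]

print(percent_colonized(example_counts))
```

A presence-or-absence definition of this kind does not capture bacterial load, which is why the differences in colonization levels among colonized animals are reported separately.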

We  also  wanted  to  assess  induced  pathological  differences,  so  a  pathologist, 
blinded  to  the  study,  analyzed  the  sections  for  diagnosis.  Animals  infected  with  wild  type 
bacteria  progressed  to  cancer  within  six  weeks,  which  is  similar  to  what  has  been 
previously  published  (14,  15).  However,  at  the  six  week  time  point,  animals  infected 
with the isogenic strains only developed gastritis. Of note, these included a number of
animals infected with the ΔcagA strain (Fig. 22A). This trend extended throughout most
of the experiment (Fig. 22B). However, by 36 weeks, most animals had progressed to
gastric cancer, again including those infected with the ΔcagA strain. Additionally, there
was  a  high  percentage  of  animals  infected  with  the  ABlD  strain  that  displayed  normal 
gastric  histology,  even  at  36  weeks  post-infection.  However,  these  results  are  likely 
confounded  by  the  fact  that  these  animals  may  not  have  been  colonized.  In  fact,  some 
animals  that  had  no  detectable  levels  of  H.  pylori  infection  developed  gastric  disease. 

This  fact  suggests  that  there  may  have  been  external  factors  influencing  disease  within 
these  animals  or  that  the  strains  contained  secondary  mutations  that  affected  colonization 
and  disease  development. 
Figure 22: Isogenic strain-induced disease states. Histograms of the percent of animals
afflicted with each disease state at six weeks (A), 16 weeks (B), and 36 weeks (C) are
shown. Animals with normal gastric mucosa are depicted by the blue bars, while animals
that suffered from gastritis are depicted by maroon colored bars. Animals with signs of
dysplasia are indicated by the yellow bars, and animals that were diagnosed with gastric
cancer are represented by turquoise bars.
Due to the potential second site mutations, a small animal study was performed to
assess  pathologic  differences  between  the  wild  type  and  a  newly  created  restorant  strain 
(DSM927)  at  the  six  week  time  point.  Histological  sections  were  graded  for  acute 
gastritis,  chronic  gastritis,  and  hyperplasia,  using  a  scale  from  1  (mild)  to  3  (marked)  as 
previously described (8). Animals infected with either the wild type or the restorant strain
showed infiltration of polymorphonuclear neutrophils in the stroma (a grade of two; Fig.
23). Animals infected with either strain also showed formation of lymphoid follicles in
the mucosa and submucosa (a grade of three; Fig. 23). Most animals infected with the
wild  type  strain  showed  the  presence  of  hyperplastic  glands  in  the  mucosa  (score  of  one) 
with  a  few  animals  showing  heterotopic  proliferative  glands  adjacent  to  the  submucosa 
(score  of  two).  Similarly,  infection  with  the  restorant  strain  produced  hyperplastic  glands 
in  the  mucosa  and  heterotopic  proliferative  glands  adjacent  to  the  submucosa  (score  of 
two;  Fig.  23). 

Conclusions 

Herein we present the first attempt to address the role of the CagA EPIYA motifs
in H. pylori-induced disease progression by the creation and characterization of isogenic
strains. Since it was shown to cause cancer in the Mongolian gerbil, we used the H.
pylori 7.13 strain as the background strain and constructed ΔcagA, ΔEPIYA, EPIYA-ABl,
-ABlC, -AB'CC, -AB'CCC, -ABlCCCC, -ABlD, and restorant isogenic strains. To
this end, we were able to successfully optimize in vitro assays to assess localization of the
isogenic strains on AGS cells, adherence to AGS cells, internalization into AGS cells,
translocation and phosphorylation of CagA, as well as morphological changes in AGS
cells, which are a surrogate for modulation of host signaling pathways. Additionally, we
Figure 23: Histopathology of the new restorant. A histogram depicting similarities in
the level of disease between animals infected with the wild type strain and the newly
created restorant strain, DSM927, is shown. Histological sections were graded for acute gastritis,
chronic gastritis, and hyperplasia, using a scale from 1 (mild) to 3 (marked) (8). Acute
gastritis is graded based on the infiltration of polymorphonuclear neutrophils, and chronic
gastritis is graded based on the number and location of lymphoid follicles. Hyperplasia is
measured by the presence and location of hyperplastic glands and heterotopic
proliferative glands.
were also able to carry out a long term, large scale animal study. While the growth
dynamics of all of these strains mimicked the wild type strain, changes in the expression
of CagA were evident across these strains. Since, theoretically, all of the isogenic strains,
with the exceptions of the ΔcagA and ΔEPIYA strains, should express equal levels of
CagA,  these  differences  in  CagA  expression  were  the  first  indication  that  the  strains 
might  contain  secondary  mutations.  Increasing  evidence  suggested  that  there  were 
secondary  mutations  within  the  strains,  since  the  wild  type,  the  EPIYA-AB'C,  and  the 
restorant  strains  did  not  act  similarly  in  the  in  vitro  assays.  Additionally,  there  were 
drastic  differences  in  the  pathology  induced  between  the  wild  type  strain  and  the  EPIYA- 
ABlC  strain  in  vivo. 

Since beginning these studies, we have learned that strain 7.13 loses its in vivo
virulence after lab passage. Also, other work in our lab has shown drastic differences in
in vivo phenotype for strains that were identically manipulated to delete the same gene.
Thus, 7.13 is likely not a good strain in which to complete our studies. However, to date each of
the isogenic strains has been reconstructed in the genetically stable background of
strain G27. Thus, the in vitro characterization of the EPIYA motifs may still be
accomplished using the techniques outlined in this chapter. Also, a new strain has been
shown to cause gastric cancer in a mouse model. Thus, there are additional options to
explore the role of the EPIYA motifs in H. pylori-induced disease progression.
Chapter Six


Discussion 

Although much has been learned about specific H. pylori virulence factors, little
is currently understood about why some H. pylori-infected individuals progress to
develop gastric cancer while others remain asymptomatic. The goal of this thesis was to
better understand the association between different polymorphisms in CagA and VacA
and disease outcome. Specifically, we showed that East Asian CagA (EPIYA-ABD) was
linked to progression to gastric cancer in a South Korean population (39). In fact, all H.
pylori strains from cancer patients expressed and delivered phosphorylatable CagA to
host cells. In contrast, the presence of the cagA gene did not strictly correlate with
expression and delivery of CagA in non-cancer strains (39). Our study was the first to
statistically link a specific cagA allele to gastric cancer development in a human
population. We next examined the role of VacA polymorphisms within that population,
and found that while the distribution of vacA alleles was not directly associated with
disease state, it was associated with the distribution of cagA alleles. Furthermore, the
vacA allele was part of a three-way association with the cagA allele and disease state.
Next, we were able to
analyze the contribution of the newly described i region of VacA to disease development.
To this end, we identified an amino acid (196) that was associated with development
of gastric cancer. We were also able to identify some associations that were
CagA-dependent, such as the association of VacA and disease state in the EPIYA-ABD
population as well as amino acid distribution at position 231 and disease state in the
non-EPIYA-ABD population. In addition, in the process of completing this thesis, we were
able to optimize techniques that will ultimately be used to characterize CagA isogenic
strains. Those future studies will help to elucidate the role of the EPIYA motifs in H.
pylori-induced host cell damage both in vitro and in vivo. En masse, the data presented
herein add to what we know about the complexity of H. pylori-induced pathogenesis.
Overall, it is becoming increasingly evident that polymorphisms within CagA and
VacA, alone and in concert, affect H. pylori-induced disease. However, the reason why
only a portion of the population develops gastric cancer still remains unclear. Other
bacterial virulence factors, as well as multiple host, dietary, and environmental factors,
have been implicated as participants in H. pylori-induced disease. Clearly, further study is
required to determine which factors are involved and what role they have in the
development of H. pylori-induced gastric cancer.

Unanswered  Questions  Stemming  from  the  Epidemiological  Studies 


CagA 

A key question that should be addressed is why there is a difference in the degree
of CagA variation between Western and East Asian strains. Western isolates vary widely
in the number of EPIYA-C motifs that are present (7, 8), whereas there is a distinct lack
of variation in East Asian strains. In fact, one study examining GenBank sequences of
500 East Asian strains found that 441 (88.2%) contained a canonical EPIYA-ABD motif
(7). Indeed, additional studies confirmed this conservation among East Asian strains;
greater than 84% of the examined strains across all three studies contained an
EPIYA-ABD motif (7, 12, 39). Moreover, in our molecular epidemiologic CagA study, the
majority of strains (87.5%) contained an EPIYA-D motif. Interestingly, those isolates
that contained a nonstandard EPIYA-ABD motif were associated with development of
gastritis (39). This finding suggests that variation in East Asian cagA is not as favorable
as in Western isolates and that variation may affect disease progression.

The  reasons  for  strict  conservation  of  the  EPIYA-ABD  motif  are  unknown,  but 
may  be  explained  by  several  possible  theories.  One  theory  is  a  difference  in  the  degree  of 
selective  pressure  for  variation  imposed  on  Western  and  East  Asian  strains.  Western 
CagA  shows  a  lower  affinity  for  SHP-2  and  is  associated  with  less  severe  inflammation, 
host  cell  morphological  changes,  and  disease  as  compared  to  East  Asian  strains  (30,  31). 
Moreover, there is a dose-response relationship between the number of EPIYA-C motifs and the levels of
tyrosine phosphorylation, SHP-2 binding, host cell morphological changes, and
inflammation induced by Western CagA (32, 39, 87). Perhaps increased inflammation
enhances  colonization,  and  therefore  may  act  as  a  positive  selective  pressure  to  increase 
the  number  of  EPIYA-C  motifs.  This  pressure  would  not  be  experienced  by  East  Asian 
strains  since  the  EPIYA-D  motif  is  already  so  biologically  active.  However,  if  increased 
inflammation  is  important  for  colonization  of  H.  pylori,  then  there  may  be  a  selective 
pressure  to  keep  a  canonical  EPIYA-ABD  motif.  For  instance,  perhaps  extra  EPIYA-A 
or  -B  motifs  in  association  with  an  EPIYA-D  motif  more  strongly  activate  the  negative 
feedback  loop  that  results  from  EPIYA-A  or  -B  binding  to  Csk,  thereby  decreasing 
inflammation  (7,  82).  Additionally,  a  single  EPIYA-D  motif  may  be  optimal  for  SHP-2 
binding,  and  extra  EPIYA-D  motifs  may  contort  CagA’s  conformation,  thereby 
destabilizing  this  interaction,  again  resulting  in  a  decrease  in  inflammation.  The  true 
reason for this conservation among East Asian strains could help elucidate the impact of
these  motifs. 

Work  on  the  EPIYA  motif  region  of  CagA  has  primarily  focused  on  differences  in 
phosphorylation  and  subsequent  modulation  of  phosphorylation-dependent  host  signaling 
pathways (reviewed in (40)). Recently, however, a CagA multimerization domain was
described  that  is  located  within  the  EPIYA  region,  and  therefore  varies  as  the  EPIYA 
motifs  vary.  Some  studies  suggest  that  this  domain  is  responsible  for  the  differential 
modulation of some phosphorylation-independent host signaling pathways (Fig. 2; 59,
74, 77, 83). However, since the existence of this domain is a very recent discovery, more
work  is  needed  to  clearly  define  the  role  of  the  multimerization  domain  on  H.  pylori- 
induced  changes  in  host  signaling  pathways.  Additionally,  changes  in  this  domain  as  a 
result  of  changes  in  the  EPIYA  motif  will  need  to  be  investigated. 

While we have gained much knowledge about the role of the C-terminus of CagA,
not much is currently known about the N-terminus of the protein. Moreover, the
studies that have been completed have produced conflicting data. Specifically, it was
demonstrated that the N-terminus of CagA is responsible for directing CagA to the
plasma membrane (14), but other data demonstrated that the EPIYA motifs located in the
C-terminus were responsible for proper localization (32, 33). Pelz et al. recently
demonstrated that two independent domains, one in the N-terminus and one in the
C-terminus, are responsible for directing CagA to the plasma membrane (69). In fact, these
authors showed that the first 200 amino acids of CagA actually act as an inhibitory
domain that dampens the host response to the C-terminus of CagA. This domain reduces
activation of the oncogenic transcription factor β-catenin, reduces the length of the
“hummingbird” protrusions, and increases the speed and strength of new cell to cell
contact (69). Thus, an interesting question is whether the activity of the inhibitory
domain varies in conjunction with varying motifs in the C-terminus. In other words, is
the effect of this inhibitory domain different based on the variations of the different cagA
alleles? This question could be addressed by removing the inhibitory domain, by deleting
the first 200 amino acids of CagA as described previously (69), within the context of
isogenic strains that differ only in the EPIYA region of cagA. Differences in induced
host cell morphological changes could then be assessed when these strains were used to
infect AGS cells, and fold changes in the number of elongated cells could be calculated
and compared between the strains and their isogenic strains lacking the inhibitory
domain. If the difference in fold change was similar across the different EPIYA isogenic
strains, this would suggest that this inhibitory domain acts independently of the EPIYA
motif. Additionally, levels of activated β-catenin could be measured in the infected host
cells. If this inhibitory domain is influenced by the EPIYA motif that is present, it could
also be interesting to determine whether there is also an interaction with the
multimerization domain. This hypothesis could be addressed by making
phosphorylation-resistant mutants within the above mentioned strains, thereby abrogating the
phosphorylation-dependent pathways; the tyrosine could be changed to an alanine or
serine through site specific mutagenesis (5, 21, 32). The multimerization domain is
important for the activation of PARP-1. Determining the levels of PARP-1 activation
between phosphorylation-resistant isolates containing the different multimerization
domains and the isogenic strains missing the inhibitory domain will identify the impact, if
any, of this inhibitory domain on the activation of CagA phosphorylation-independent
pathways. While we know quite a lot about the contribution of CagA and the EPIYA
motifs to H. pylori-induced disease, there are clearly more questions on the molecular
level that need to be answered.
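
The fold-change comparison proposed above could be organized along the following lines. This is only a sketch of the proposed analysis: the strain labels, the percentages of elongated cells, and the tolerance used to call fold changes "similar" are all hypothetical, and a real analysis would rely on replicate experiments and an appropriate statistical test rather than a fixed tolerance.

```python
def fold_change(pct_with_domain, pct_without_domain):
    """Fold change in elongated cells when the inhibitory domain is deleted."""
    if pct_with_domain <= 0:
        raise ValueError("percent elongated with the domain must be positive")
    return pct_without_domain / pct_with_domain


def fold_changes_similar(fold_changes, tolerance=0.25):
    """True if all fold changes fall within `tolerance` of one another.

    The tolerance is an arbitrary placeholder for this sketch.
    """
    values = list(fold_changes.values())
    return max(values) - min(values) <= tolerance


# Made-up percent elongated cells: (parent strain, strain lacking the domain).
hypothetical_results = {
    "ABC": (20.0, 40.0),
    "ABCC": (30.0, 60.0),
    "ABD": (25.0, 50.0),
}

fold_changes = {
    strain: fold_change(with_domain, without_domain)
    for strain, (with_domain, without_domain) in hypothetical_results.items()
}

print(fold_changes)
print(fold_changes_similar(fold_changes))
```

In this made-up example every strain shows the same twofold increase, which is the pattern that would suggest the inhibitory domain acts independently of the EPIYA motif; divergent fold changes would instead point to an interaction.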

VacA

Many questions still remain to be addressed regarding the different vacA alleles.
Included among these are questions concerning the signal (s) region of the protein.
Geographic differences within the s1 region have been reported, and three subtypes
have been identified: s1a, s1b, and s1c (11). Are these differences important for how
VacA acts on host cells and/or its interplay with other virulence factors? Although this
nomenclature is now seldom used, it would be interesting to determine if the different
subtypes display any functional differences in activity, since the s region is responsible
for most of the toxicity of VacA (52-54, 72). This objective could be accomplished by
creating isogenic strains that vary only in the s1 subtype or, more simply, by intoxicating
eukaryotic cells with identical concentrations of purified VacA containing one of the
three different subtypes. One could assess the ability of the various VacA toxins to cause
vacuoles within host cells as well as induce apoptosis, which could be measured through
TUNEL assays or by measuring the amount of activated caspase 3.

Another question that has arisen from our VacA epidemiology work concerns the
overall importance of the middle (m) region of the protein. The first aspect of the m
region that should be addressed is the association between the m region and gender. Our
work was the first study to observe such an association (37), which raises the question of
why this association exists. Females appear more likely to be infected with strains
encoding the m2 vacA allele (37). Does this association exist in populations where the
m2 allele is more prevalent, such as in regions of China (67) and Poland (49)? If it does
exist, then there may be something physiologically different between the gastric
environments of males and females. Although the nature of these differences is unknown,
there are many possibilities. For instance, is there a difference in the pH of the stomach
acid that may consequently influence disease state? Is there a difference in the
actual gastric epithelium between males and females? Minute differences in the
thickness or composition of the mucus layer, which could in turn impact contact of H.
pylori with the gastric epithelium, could affect the amount of toxin delivered to host cells.
Are there differences in the amount or type of adherence receptors expressed in the
gastric epithelium of males versus females? Furthermore, what effect does the endocrine
system, more specifically changes in hormone levels, have on this process? Recently, Dr.
Claire Fraser-Liggett has identified a gastric microbiome and is currently identifying
differences in this bacterial population between individuals (2009 Bullard Lecture). It is
possible that the differences in eating habits between the sexes influence this microbiome
and thereby the distribution of the m vacA allele. Answers to these questions may help to
explain the differences in distribution of the m1 vacA allele between men and women.
Overall, the increased cellular tropism of the m1 vacA allele (66), the finding that the
presence of the m1 region increases the risk for gastric cancer (79), and the fact that
patients infected with H. pylori strains encoding the m2 allele are more likely to be
female (37) may explain why males are overall more likely to develop gastric cancer.

The second aspect of the m allele that should be examined is its contribution to
the association between VacA and CagA, as well as between VacA, CagA, and disease
state. Does the association between the m region and the cagA allele occur simply
because the vacA regions that are responsible for more severe disease also co-vary among
themselves? This possibility does not seem likely in this population, since we only found
the s1 allele and were still able to detect this association. Furthermore, is the three-way
association  between  the  m  region,  CagA,  and  disease  state  due  to  the  fact  that  the  m2 
allele  has  a  narrower  cell  tropism,  thereby  affecting  the  types  of  cells  VacA  can 
intoxicate?  If  this  were  the  case,  one  would  expect  to  see  a  direct  correlation  between  the 
m  region  and  disease  state.  However,  this  correlation  is  not  seen  within  this  South 
Korean  population. 

Several questions also still remain regarding the intermediate (i) region of VacA.
The major difference between the i1 and i2 alleles is the addition of three polar amino
acids (asparagine, histidine, and serine) in cluster C (37, 38). Since both clusters B and
C have been suggested to affect toxin activity, studies directed towards understanding the
specific role of these amino acids in vacuolating activity would be interesting. In order to
better examine this, an i1 allele could be genetically engineered through the addition of
only these three amino acids. It would not be surprising if the addition of these
amino acids decreased toxin activity, since it has been demonstrated that additional
amino acids near the cleavage site in the s region decrease the ability of the toxin to
integrate into the cellular membrane, which in turn decreases toxin activity (11, 53).
Such a result might explain why the i region has been suggested to be a better predictor
of disease.

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Another  aspect  of  the  i  region  worth  examining  involves  toxin  evolution  and  the 
presence  of  the  i3  allele.  Is  the  presence  of  the  i3  region  really  a  snapshot  of  the 
evolution  of  the  i  region  from  one  allele  to  another  (i  1  or  i2)?  It  would  be  interesting  to 
infect  animals  with  H.  pylori  strains  containing  the  i3  allele,  and  to  sequence  several 
recovered  isolates  after  infection  to  see  if  the  i3  allele  has  been  replaced  with  either  an  il 
or  i2  allele.  Since  evolution  is  not  a  rapid  event,  this  experiment  would  consist  of  long 
tenn  animal  infections  -  36  weeks.  Isolates  from  these  animals  could  be  sequenced,  but 
would  likely  be  used  to  subsequently  re-infect  new  animals,  as  multiple  passages  in  vivo 
would  likely  be  needed  to  evaluate  any  evolution  of  this  allele.  This  experiment  should 
be performed with an i3 strain where cluster B is from an i1 strain and cluster C is from
an i2 strain, as well as an i3 strain where cluster B is from an i2 strain and cluster C is
from an i1 strain. Results from these experiments could indicate if there is an overall
selective pressure to evolve in vivo, and if the original cluster sequences influence that
evolution. In this same vein, it would be interesting to determine if there is a functional
difference between the i3 allele and the i1 and i2 alleles. If so, is there a functional
difference between i3 strains that contain cluster B from an i1 strain and cluster C from
an i2 strain as compared to i3 strains that contain cluster B from an i2 strain and
cluster C from an i1 strain? Alternatively, is toxicity dependent on the sequence of an
individual  cluster? 
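
The strain comparisons proposed above rest on a simple classification rule, which can be written out as a minimal sketch. It assumes that an i1 allele carries i1-type sequence at both cluster B and cluster C, that an i2 allele carries i2-type sequence at both, and that any mixed combination is scored as i3. The function names and the example isolates are hypothetical and only illustrate how isolates recovered after in vivo passage might be tallied.

```python
from collections import Counter

# Cluster types assumed for this sketch; each cluster is scored as i1-like or i2-like.
VALID_TYPES = ("i1", "i2")


def classify_i_region(cluster_b, cluster_c):
    """Return 'i1', 'i2', or 'i3' from the types of cluster B and cluster C."""
    for label, value in (("cluster_b", cluster_b), ("cluster_c", cluster_c)):
        if value not in VALID_TYPES:
            raise ValueError(f"{label} must be 'i1' or 'i2', got {value!r}")
    # Matching clusters give the corresponding allele; a mixed pair is scored as i3.
    return cluster_b if cluster_b == cluster_c else "i3"


def tally_recovered_alleles(isolates):
    """Count i-region alleles among recovered isolates given as (cluster B, cluster C) pairs."""
    return Counter(classify_i_region(b, c) for b, c in isolates)


if __name__ == "__main__":
    # Hypothetical isolates recovered from an animal infected with an i3 (B = i1, C = i2) strain.
    recovered = [("i1", "i2"), ("i1", "i2"), ("i1", "i1")]
    print(tally_recovered_alleles(recovered))
```

Tallying both i3 configurations separately (B = i1/C = i2 versus B = i2/C = i1) would show whether the starting cluster arrangement influences which allele, if any, replaces i3.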

Finally,  it  would  be  interesting  to  look  at  a  population  containing  an  increased 
percentage  of  i2  alleles  in  order  to  assess  the  distribution  of  amino  acids  at  position  196 
on  the  vacA  allele  as  well  as  on  disease  state.  Our  work  demonstrated  that  the 
distribution of amino acids found at this position was linked to more severe disease,
specifically gastric cancer (38). In the South Korean population analyzed, all of the i2
alleles encoded a leucine at amino acid 196 (38). Thus, it would be
interesting to analyze a population where the i2 allele is more prevalent to see if this trend
persists.  If  so,  the  amino  acid  found  at  this  position  may  be  partially  responsible  for  the 
results  of  a  previous  study  in  which  it  was  concluded  that  the  i  region  was  the  best 
predictor  of  disease  (75).  Further  studies  could  also  investigate  variation  in  the  major 
amino acid differences seen between the i1 and i2 alleles. This line of research might
better  indicate  which  amino  acids  are  critical  for  toxin  activity. 

Cag/Vac  Interaction 

Recent  studies  have  identified  an  association  between  the  cagA  allele  and  the 
vacA allele that appears to affect H. pylori toxicity and disease severity (37, 84, 89).
Infection with H. pylori strains that encode for CagA and s1/m1 VacA results in highly
active  corpus  gastritis  (57),  which  is  linked  to  the  development  of  gastric  cancer  (55-57). 
Our  study  also  found  an  association  between  the  cagA  allele  and  vacA  allele,  as  well  as  a 
three  way  association  between  the  distribution  of  the  cagA  and  vacA  alleles  and  disease 
state  (37).  Indeed,  in  our  South  Korean  population,  the  majority  of  H.  pylori  strains  carry 
the  most  toxic  form  of  both  VacA  and  CagA,  and  this  may  explain  the  high  rate  of  severe 
gastric  disease  among  the  South  Korean  population  (37). 

Conventionally,  one  might  think  that  both  toxins  concomitantly  exert  drastic 
effects  on  the  same  host  cell.  However,  recent  data  suggest  that  the  converse  is  true;  the 
presence  of  both  CagA  and  VacA  may  dampen  the  effect  of  each  protein  alone,  possibly 
leading to increased survival of infected host cells (9). In fact, when both toxins are
present, there is less VacA-induced apoptosis than when cells are intoxicated with VacA
alone  (9).  Additionally,  eukaryotic  cells  intoxicated  with  both  toxins  demonstrate  less 
CagA-induced  morphological  changes  as  compared  to  cells  intoxicated  with  CagA  alone 
(9). 

Since  these  results  are  still  fairly  new,  many  questions  remain.  For  instance,  is 
there a direct interaction between CagA and VacA, or more likely, is this effect the result
of activation of competing pathways by the two toxins (Fig. 2 and 3)? If there is a direct
interaction between these two proteins, this could possibly be detected by performing
pull-down assays. Does VacA somehow amplify the function of the newly identified
inhibitory domain in the N-terminus of CagA (69)? Also, in thinking about the
chronology of H. pylori infection, does the bacterium utilize the two toxins to increase
the life span of the host cell and thus, to prolong infection? Indeed, it seems plausible
that the most severe forms of gastric disease would result from long term infection of
cells, and therefore, long term H. pylori-induced inflammation. In terms of CagA and
VacA “interaction,” does an order of events exist that is important for the resulting
effects?  For  example,  since  VacA  is  secreted  while  CagA  is  injected  by  H.  pylori,  do 
cells  first  need  to  be  intoxicated  by  one  or  the  other  toxin  to  see  the  protective  correlate? 
This question could be directly assessed through in vitro assays. First, eukaryotic cells
would be transfected with cagA under control of an inducible promoter. CagA could then be
induced within these cells, and the cells then intoxicated with VacA at various time points,
followed by analysis of these cells for induction of apoptosis and morphological changes.
Additionally, is the dampening effect of the toxins achieved by reaching a threshold of both
toxins, or is the mere presence of any amount of the two toxins sufficient? This question
could be assessed by determining if there is a dose response to the toxins. For instance,
the  transfected  eukaryotic  cells  described  above  could  be  intoxicated  with  increasing 
amounts of VacA and a set amount of induced CagA, or with a set amount of VacA and
increasing amounts of induced CagA. Eukaryotic cells would then be analyzed for levels
of apoptosis as well as morphological changes. Additionally, the order of toxin addition
could be reversed, based on the findings from the above study assessing the importance
of  the  order  of  intoxication. 
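
The two titration arms and the order-of-addition comparison described above can be laid out as an explicit list of conditions. The following is only a planning sketch: the dose values, the arbitrary units, the function name, and the condition labels are all hypothetical, and the readouts are simply the two named in the text (apoptosis and morphological changes).

```python
from itertools import product


def titration_conditions(vaca_doses, caga_levels, fixed_vaca, fixed_caga,
                         orders=("CagA first", "VacA first")):
    """Enumerate conditions for the two titration arms crossed with order of toxin exposure.

    Arm 1 varies VacA against a set amount of induced CagA.
    Arm 2 varies induced CagA against a set amount of VacA.
    All doses are in arbitrary, hypothetical units.
    """
    arms = [("VacA titration", v, fixed_caga) for v in vaca_doses]
    arms += [("CagA titration", fixed_vaca, c) for c in caga_levels]
    return [
        {
            "arm": arm,
            "vaca_dose": vaca,
            "caga_induction": caga,
            "order": order,
            "readouts": ("apoptosis", "morphology"),
        }
        for (arm, vaca, caga), order in product(arms, orders)
    ]


if __name__ == "__main__":
    plan = titration_conditions(
        vaca_doses=[1, 2, 4], caga_levels=[0.5, 1.0], fixed_vaca=2, fixed_caga=1.0
    )
    for condition in plan:
        print(condition)
```

A threshold effect would appear as a step in the readouts along one titration arm, whereas a graded dose response would change steadily across the doses.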

Finally,  considering  CagA  and  VacA  “interaction”  in  the  context  of  our 
epidemiological  data,  what  are  the  physiological  consequences  of  intoxication  with  the 
different  CagA  or  VacA  variants,  or  combination  of  these  different  toxins?  For  instance, 
if  a  strain  carries  EPIYA-ABD  CagA  and  s2/i2/m2  VacA,  which  typically  shows  no  toxic 
activity,  are  the  effects  of  CagA  similar  to  a  strain  that  carries  an  EPIYA-ABD  CagA  but 
no VacA? Alternatively, is the combination of Western CagA and s1/i1/m1 VacA more
or  less  lethal  to  cells  than  East  Asian  CagA  and  s2/i2/m2  VacA?  These  and  other  allele 
based  questions  could  be  addressed  by  making  VacA  isogenic  strains  within  the  context 
of  the  CagA  EPIYA  isogenic  strain  background  as  discussed  later  in  this  chapter.  Since 
evidence is increasing that the association between the cagA allele and the vacA allele
impacts disease development, I strongly believe that the future of pathogenesis studies in
H. pylori will have to focus on the effect of combinations of these virulence factors.

A  Hierarchy  of  Virulence  Factors 

Another  emerging  theory  is  that  CagA  may  be  the  “master”  virulence  factor,  and 
that other virulence factors or polymorphisms are important only in the context of which
cagA allele is present (13, 37, 38). In fact, studies have found that in terms of gastric
cancer,  the  cagA  allele  carried  is  the  most  important  bacterial  risk  factor  (15). 
Conversely,  the  i  region  of  VacA  is  the  best  predictor  for  duodenal  ulcers  (15).  Indeed, 
in  our  epidemiological  studies,  we  found  that  different  associations  existed  in  populations 
carrying  particular  cagA  alleles  and  that  these  associations  were  not  found  in  populations 
encoding  for  a  different  cagA  allele.  For  instance,  when  age  and  gender  were  taken  into 
account,  a  two  way  association  between  the  distribution  of  the  vacA  allele  and  disease 
state was found only within the EPIYA-ABD CagA population. Furthermore,
non-s1/i1/m1 vacA alleles were associated with duodenal ulcers within the population
carrying the East Asian EPIYA-ABD CagA, but with gastritis in the population carrying
any other genotype of CagA (37). We also found that an association existed between
disease state and amino acid 231 of the VacA i region, but only within the
non-EPIYA-ABD population (38). These findings again suggest that the effect of different virulence
factors  or  polymorphisms  within  these  virulence  factors  may  be  masked  by  which  cagA 
allele  is  present.  Indeed,  this  fact  may  help  explain  the  vast  amount  of  conflicting 
literature  concerning  the  importance  of  these  different  virulence  factors.  Employing  the 
statistical  technique  of  meta-analysis  to  survey  the  epidemiological  data  in  different 
geographic  regions  might  help  to  shed  light  on  some  of  these  reported  differences. 
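
As an illustration of what such a meta-analysis involves, the sketch below pools odds ratios from several studies using inverse-variance, fixed-effect weighting of the log odds ratios. This is one common approach and is assumed here rather than prescribed; a random-effects model may be more appropriate if the geographic regions are truly heterogeneous. The 2 x 2 counts in the example are invented for illustration and are not data from any study.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its variance for one 2 x 2 table.

    a, b: cases with / without the allele; c, d: controls with / without the allele.
    A 0.5 continuity correction is applied if any cell is zero.
    """
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pooled_odds_ratio(tables):
    """Fixed-effect pooled odds ratio with an approximate 95% confidence interval."""
    weights, weighted_effects = [], []
    for table in tables:
        effect, variance = log_odds_ratio(*table)
        weights.append(1 / variance)  # more precise studies receive more weight
        weighted_effects.append(effect / variance)
    pooled = sum(weighted_effects) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return math.exp(pooled), (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))


if __name__ == "__main__":
    # Hypothetical counts from three regions: (cases+, cases-, controls+, controls-).
    studies = [(30, 10, 20, 20), (45, 15, 25, 30), (12, 8, 10, 14)]
    print(pooled_odds_ratio(studies))
```

Running the pooling separately within each cagA genotype would be one way to test whether the conflicting reports reflect CagA-dependent associations.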

Lessons Learned from the Current Project

Unfortunately, the major molecular biology study in my thesis project
encountered  problems  since  it  was  ultimately  discovered  that  the  isogenic  cagA  strains 
that were constructed actually contained secondary mutations. While we do not
understand exactly how these mutations arose, some clues may come from thinking about
the inherent genetic variability of the bacterium. H. pylori has an increased
rate of spontaneous mutations as compared to E. coli, with initial studies demonstrating
that  the  spontaneous  mutation  rate  in  H.  pylori  is  1 0"7- 1 0  s  (28,  35,  85).  Again,  this  rate 
varies among strains and a rate as high as 3 x 10^-4 has been observed (46). In fact,
genetic  polymorphisms  seem  to  be  normal  between  strains.  In  a  study  that  examined  the 
genetic sequence of a house-keeping gene (glmM), it was found that the sequence was
unique  in  all  the  strains  examined  (44).  Furthermore,  this  microdiversity  was  observed  in 
a  number  of  other  genes  (1,  3,  4,  34,  65,  78),  as  well  as  within  strains  taken  from  the 
same patient (36). Additionally, the H. pylori genome contains multiple genes that are
phase variable. Indeed, when a single reference strain was sequenced, up to 27 genes
were identified that contained nucleotide repeats that could facilitate phase variation (51,
81);  two  examples  of  these  genes  that  have  been  examined  in  more  detail  include  fliP 
(41)  and  oipA  (88). 

Animal  passage  of  strains  has  been  shown  to  induce  formation  of  large  numbers 
of fragmented genes and repeated regions (22). Also, Mongolian gerbils are naturally
infected with H. bilis, and H. pylori strains are naturally competent. In fact, H. pylori
has  the  ability  to  take  up  new  DNA  in  vivo,  which  creates  a  constant  chance  for  genetic 
exchange  since  a  host  can  be  infected  with  multiple  H.  pylori  strains  or  related  bacteria 
(24,  25,  36,  43).  This  phenomenon  has  been  well  documented  with  one  of  the  H.  pylori 
reference  strains  (J99).  When  an  archived  isolate  of  J99  was  compared  to  isolates  from 
the  same  patient  taken  6  years  apart,  there  was  a  high  level  of  genetic  diversity  (36). 
Collectively, the new isolates had lost up to 2.3% of the open reading frames compared to
the archived J99 strain. Additionally, these strains had gained DNA that was not found in
the  original  J99  strain  (36).  Overall,  natural  competence  has  been  proposed  to  contribute 
to  the  vast  allelic  diversity  of  the  organism,  and  to  help  account  for  the  considerable 
genetic  variability  (6-7%)  that  is  seen  between  strains  (4,  26,  50,  80). 

In  order  to  reduce  the  potential  for  genetic  variation  that  could  affect  our 
experiments,  the  Merrell  group  has  adapted  certain  lab  protocols.  For  instance,  when  we 
create  a  mutant  strain,  we  select  a  single  colony  of  the  mutant  strain  and  then  never 
utilize  single  colonies  again.  Bacteria  are  expanded  as  patches  of  cells  from  the  freezer 
stock  (-80°C)  on  antibiotic-supplemented  horse  blood  agar  plates  for  36-42  hours,  which 
is  the  minimal  amount  of  time  for  growth.  Bacteria  are  then  expanded  as  lawns  from 
these  patches  for  about  20  hours  on  plates.  The  lawn  is  then  used  to  inoculate  18  hour 
liquid  starter  cultures  that  are  ultimately  used  to  inoculate  OD  controlled  experimental 
liquid  cultures.  All  of  these  protocols  are  performed  in  an  attempt  to  minimize  the 
number  of  lab  passages  of  the  strain,  and  to  make  sure  that  if  genetic  variability  occurs,  it 
does  so  in  the  context  of  a  population  of  cells.  Furthermore,  when  feasible,  we  also 
create  an  independent  biological  isolate  of  all  mutant  strains. 

In devising our isogenic strain study, we first created a ΔEPIYA strain, which was
used  as  the  strain  background  for  all  subsequent  strains.  Moreover,  we  followed  the 
aforementioned  lab  protocols  for  expanding  bacteria  from  freezer  stocks,  transformation, 
selection,  and  growth,  which  were  designed  to  minimize  the  possibility  of  variation. 
Therefore,  there  was  no  reason  to  believe  that  these  strains  would  contain  secondary 
mutations.  However,  upon  reflection  on  the  project,  there  were  several  different  points 
throughout the process when the data suggested that there might be a secondary mutation
that would complicate our study. Sequence analysis as well as growth dynamics
suggested that the strains were in fact isogenic. However, the first indication that the
strains were not behaving as expected came with the quantification of CagA expression.
Theoretically, all of the isogenic strains, with the exceptions of the ΔcagA and ΔEPIYA
strains, should have expressed CagA at similar levels. However, up to a two-fold
difference could be seen between the EPIYA-AB'CCCC and wild type strains (Fig. 16).
Conversely,  there  were  no  major  differences  when  CagA  expression  was  compared  to  the 
restorant  strain,  which  I  believed  to  be  a  reasonable  comparison;  the  restorant  strain 
should  be  genetically  identical  to  the  7.13  wild  type  strain,  and  had  undergone  the  same 
genetic  manipulation  as  the  other  isogenic  strains.  Minor  differences  in  the  restorant 
suggested  that  perhaps  the  genetic  manipulation  of  the  strain  slightly  altered  the  overall 
expression  level  of  CagA.  At  the  time,  we  considered  this  as  no  surprise;  however,  in 
hindsight,  this  result  may  have  been  the  first  indication  that  there  were  problems  with  the 
strains. 

While  CagA  has  not  been  shown  to  affect  the  adherence  of  H.  pylori  (5),  any 
difference  in  adherence  of  strains  to  host  cells  would  potentially  alter  the  amount  of 
CagA  that  could  be  translocated  and  phosphorylated,  thereby  changing  the  deregulation 
of  host  cell  signaling  pathways  and  potentially  affecting  development  of  gastric  disease. 
In  our  studies,  the  adherence  assay  was  the  first  assay  that  showed  marked  differences 
between  the  isogenic  strains.  Indeed,  the  restorant,  EPIYA-AB'C,  and  wild  type  strains, 
which  should  have  all  adhered  at  similar  levels,  did  not  behave  as  expected;  the  restorant 
strain  was  10-fold  less  adherent  than  the  wild  type  or  EPIYA-AB'C  strains.  When  these 
preliminary results showed the lack of consistency between the restorant, EPIYA-AB'C,
and wild type strains, we sought to rule out the potential confounding effect of slight
differences in the growth phase of the different isogenic strains (Fig. 15).
Re-examination of the growth curves suggested that at 18 hours some of the isogenic strains
may  have  entered  stationary  phase,  which  could  adversely  affect  the  adherence  of  the 
bacteria  to  the  AGS  cells.  We  therefore  repeated  the  assay  using  12  hour  liquid  cultures 
to  infect  the  AGS  cells.  We  found  that  while  the  number  of  adherent  bacteria  was 
increased,  the  trends  stayed  the  same,  suggesting  that  there  were  secondary  mutations 
within  the  strains  (Fig.  18). 

We  next  assessed  the  ability  of  the  strains  to  deliver  CagA  to  host  cells,  where  it 
is  phosphorylated  by  host  cell  kinases,  thereby  causing  morphological  changes  within  the 
cells.  While  there  were  slight  differences  in  the  peak  phosphorylation  time  of  CagA 
between  the  strains,  the  trends  across  the  strains  were  the  same.  However,  assessment  of 
the  five  hour  time  point,  which  was  a  time  point  shown  in  a  previous  study  to  allow 
detection  of  high  levels  of  phosphorylation  (39),  presented  more  evidence  that  there  were 
secondary  mutations  in  our  strains.  While  increasing  numbers  of  EPIYA-C  motifs 
corresponded  to  increasing  amounts  of  phosphorylated  CagA,  the  three  isogenic  strains 
that contained only one EPIYA motif (EPIYA-AB'C, EPIYA-AB'D, and the restorant)
showed  similar  levels  of  phosphorylation  (Fig.  19).  Unfortunately,  these  levels  of 
phosphorylation  were  2.5  times  less  than  the  amount  of  phosphorylated  CagA  from  the 
wild type strain (Fig. 19). This fact, combined with the fact that the isogenic strains
expressed approximately 70% of the amount of CagA that the wild type strain expressed,
immediately suggested that there was something different among the isogenic strains.
Additionally, the levels of phosphorylation did not translate into the expected changes in
host cell morphology. Once again, the wild type, EPIYA-AB'C, and restorant strains all
behaved differently (Fig. 20), suggesting that there was a second site mutation within at
least  some  of  the  strains. 

Because of the long duration of the experiments, the large-scale animal study was
regrettably  started  prior  to  the  completion  of  the  in  vitro  characterization  of  the  isogenic 
strains.  The  animal  study  showed  that  there  was  a  complete  lack  of  differences  in  disease 
progression  among  Mongolian  gerbils  infected  with  the  different  isogenic  strains. 
Moreover,  there  was  a  drastic  difference  in  the  pathology  induced  in  animals  infected 
with  the  wild  type  strain  as  compared  to  the  EPIYA-AB'C  strain.  This  fact  alone 
suggests  that  there  is  something  fundamentally  different  between  these  two  strains  that  is 
not CagA-related. Knowing what we know now, ideally a small pilot experiment of eight
to ten weeks duration with fewer animals should have been conducted before
proceeding  with  the  large  scale  animal  study. 

Additionally,  all  H.  pylori  infection  groups  progressed  to  gastritis  and  eventually 
gastric cancer at the same rate as gerbils infected with the ΔcagA strain (Fig. 22), which
has been shown previously to cause no gastric cancer in this model (23). This finding led
to the re-creation of the ΔEPIYA strain followed by the creation of a new restorant strain.
This  new  restorant  strain  induced  pathology  similar  to  the  wild  type  strain  in  gerbils  six 
weeks  after  infection;  these  animals  displayed  dysplasia  and  were  progressing  to  gastric 
cancer  (Fig.  23).  This  result  combined  with  all  the  inconsistencies  in  the  in  vitro  work 
suggests that there was a second site mutation within the original ΔEPIYA strain;
therefore, no correlations could be made and there was no reason to further characterize
these strains.

Since that time, we have learned that strain 7.13 loses in vivo virulence after a low
number of lab passages (R. Peek, personal communication). Additionally, other work in
our lab has shown drastic in vivo differences for strains that have been identically
manipulated to delete the same gene. Knowing this now, it might have been prudent to
measure the mutation rate of 7.13 prior to beginning our studies in order to determine
whether  this  strain  has  a  higher  mutation  rate  than  other  strains  of  H.  pylori. 
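
One standard way to estimate a spontaneous mutation rate is a fluctuation assay analyzed by the p0 method of Luria and Delbrück; this is offered as an illustrative assumption, since the approach that would have been used for strain 7.13 is not specified here. In the sketch below, parallel cultures are plated on a selective agent, the fraction of cultures with no resistant colonies (p0) gives the mean number of mutational events per culture (m = -ln p0), and dividing by the final number of cells per culture gives a rate per cell. The colony counts and culture size are hypothetical.

```python
import math


def p0_mutation_rate(mutants_per_culture, cells_per_culture):
    """Estimate mutation rate per cell by the p0 method.

    mutants_per_culture: resistant colony counts, one per parallel culture.
    cells_per_culture: final viable cell number per culture (assumed equal across cultures).
    """
    n_cultures = len(mutants_per_culture)
    n_without_mutants = sum(1 for count in mutants_per_culture if count == 0)
    if n_without_mutants == 0 or n_without_mutants == n_cultures:
        # The method is undefined unless some, but not all, cultures lack mutants.
        raise ValueError("p0 method needs 0 < p0 < 1; adjust culture size and repeat")
    p0 = n_without_mutants / n_cultures
    events_per_culture = -math.log(p0)  # Poisson zero class: p0 = exp(-m)
    return events_per_culture / cells_per_culture


if __name__ == "__main__":
    # Hypothetical resistant colony counts from ten parallel cultures of 1e8 cells each.
    counts = [0, 0, 0, 0, 0, 1, 3, 2, 7, 1]
    print(p0_mutation_rate(counts, 1e8))
```

Comparing the estimate for a candidate strain against one obtained in parallel for a reference strain such as G27 would indicate whether the candidate is unusually mutable before any isogenic strains are built.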

Looking  Forward 

Since  H.  pylori  strain  7.13  was  too  genetically  unstable  to  use  for  these  studies,  is 
there  a  future  for  this  project?  I  believe  that  there  is  and  would  propose  the  following 
possibilities:  1)  use  strain  G27  for  in  vitro  characterization  and  2)  use  strain  PMSS1  for 
both  in  vitro  and  in  vivo  characterization.  G27  is  our  lab’s  commonly  used  reference 
strain  and  has  been  shown  to  be  fairly  genetically  stable.  Currently,  all  of  the  CagA 
isogenic  strains  have  been  created  in  the  G27  background,  and  these  strains  can  now  be 
used  to  determine  the  effect  of  the  EPIYA  motifs  on  host  cell  signaling.  The  EPIYA 
motif  region  of  these  strains  has  been  sequenced  to  verify  the  genetic  changes.  To  this 
end, if I were completing these studies, I would again start by assessing growth kinetics of
these  strains,  as  well  as  analyzing  expression  of  CagA  via  Western  blot  analysis.  The 
Western  blot  analysis  should  identify  any  differences  in  CagA  expression  that  could 
complicate  future  experiments.  Next,  I  would  look  at  interaction  of  the  strains  with  host 
cells  by  measuring  bacterial  localization,  adherence,  internalization,  CagA 
phosphorylation,  and  induced  host  cell  morphological  changes.  Though  I  would  initiate 
these studies using AGS cells, it might also be interesting to assess another gastric cell
line that is able to form cellular barriers (HGE cells) or a cell line that can be polarized
(T84 cells); each has been used to study the dynamics of H. pylori infection. Using the
protocols  in  chapter  five  as  a  starting  point,  each  assay  would  need  to  be  optimized  for 
the  change  in  H.  pylori  strain  background. 

After completion of the basic characterization, modulation of host cell signaling
pathways could be assessed. This should initially be assessed through analysis of
changes  in  host  cell  morphology,  as  described  in  chapter  five.  However,  there  are  some 
changes  that  might  be  considered.  One  concern  is  that  we  perhaps  did  not  ever  achieve 
the  maximal  percent  of  elongated  cells  in  our  assay;  subsequent  smaller  experiments 
showed  that  the  highest  percent  of  elongated  cells  was  observed  between  12-18  hours 
post  infection.  Additionally,  while  the  percent  of  elongated  cells  tapered  off,  in  a  single 
experiment  I  conducted,  the  length  of  the  protrusions  in  elongated  cells  infected  with  the 
EPIYA-AB'D strain still continued to increase even at 18 hours. These results could
indicate that the EPIYA-AB'D strain may elicit reactions more slowly, yet its overall
effects may be more drastic. Thus, it would be wise to explore additional time points,
such as 24 and 36 hours post infection, to assess this possibility. Of course, if these
strains  take  longer  to  adhere  or  if  they  adhere  at  much  lower  levels,  then  the  MOI  or  time 
points  studied  post  infection  would  have  to  be  adjusted  accordingly.  Alternatively,  as 
noted in chapter five, the trends for phosphorylation of CagA were similar for all isogenic
strains, so this may be a measurement of the continued activation of SHP-2. This SHP-2
activation difference would be important if activation of other phosphorylation dependent
cellular pathways were analyzed. Once the basic characterization of these strains is
complete, modulation of specific pathways could be assessed. For example, Erk
activation could be assessed via Western blot analysis, as could activation of NF-κB.
Localization of β-catenin to the nucleus could be assessed via microscopy or Western
blot analysis of nuclear extracts of infected cell lysates. Additionally, since there is a
high degree of genetic variability between strains, these strains could be created within
other reference strains, such as J99, 26695, or HPAG-1 to assess the role of the EPIYA
motifs in vitro. An additional alternative for moving this project forward could be the
use of the parental strain (PMSS1) of the mouse derivative Sydney strain 1 (SS1).
PMSS1 has recently been characterized (10, 47), and colonizes mice, but at a lower level
than SS1 (10). PMSS1 encodes for a functional CagA protein, whereas SS1 does not
express  or  deliver  functional  CagA  (10,  71).  Not  only  does  PMSS1  produce  a  functional 
CagA,  but  this  strain  has  been  shown  to  cause  severe  pathology  in  mice,  including 
atrophy,  hyperplasia,  and  metaplasia  (10).  Since  our  knowledge  about  this  strain  is 
limited,  and  it  is  known  to  lose  the  ability  to  inject  CagA  into  host  cells  after  1  month  in 
vivo  (10),  it  might  be  wise  to  measure  the  spontaneous  mutation  rate  of  PMSS1  before 
conclusively deciding to use it for isogenic strain construction. Ideally, there would be an
H. pylori strain that infects animals, delivers CagA, causes severe pathology (gastric
cancer), and is genetically stable; however, in the absence of this ideal strain, PMSS1
may  allow  us  to  identify  the  role  of  the  EPIYA  motifs  in  vivo.  Provided  preliminary 
results  are  satisfactory,  new  primers  to  amplify  the  upstream  (5’)  cagA  region  and  the 
downstream  (3’)  cagA  region  would  need  to  be  designed  in  order  to  create  the  constructs 
needed  to  produce  the  isogenic  strains.  Then,  in  vitro  characterization  of  the  newly 
constructed  strains  could  proceed  as  described  above.  Once  these  assays  are  completed,  a 
small pilot animal study to assess colonization load, histology, and the timeline for
disease progression should be completed. Overall, the utilization of this strain could
provide  another  option  to  explore  the  role  of  the  EPIYA  motifs  not  only  in  vitro  but  also 
in  in  vivo  assays. 

Importance  and  the  Impact  of  Future  Studies 

Gastric  cancer  is  still  the  second  most  common  cause  of  cancer  morbidity  and 
mortality,  and  this  could  be  reflective  of  the  high  incidence  of  H.  pylori  infection  (20,  60, 
68,  86).  It  could  also  be  a  result  of  the  high  prevalence  of  cagA  in  many  H.  pylori  strains, 
or  due  to  the  presence  of  certain  CagA  polymorphisms  that  predominate  in  geographic 
areas  that  have  high  rates  of  gastric  cancer  (2,  18,  20,  27,  29,  45,  86).  Due  to  the  fact  that 
we  do  not  yet  thoroughly  understand  the  process  of  H.  pylori  induced  pathogenesis, 
including  development  of  gastric  cancer,  elucidation  of  virulence  factors  or  virulence 
factor  polymorphisms  that  impact  disease  is  imperative.  Epidemiological  studies  are 
traditionally  good  indicators  of  trends  and  serve  as  a  starting  point  for  molecular  studies. 
Unfortunately,  H.  pylori  is  an  organism  that  shows  a  high  rate  of  genetic  variability, 
which  limits  the  impact  of  traditional  epidemiological  studies.  Thus,  in  order  to  elucidate 
the  exact  role  of  virulence  factors  or  polymorphisms,  it  is  best  to  assess  differences  by 
creating  isogenic  strains. 

The  successful  creation  of  EPIYA  isogenic  strains  will  not  only  answer  the 
question  as  to  what  role  the  EPIYA  motifs  play  in  disease  manifestations,  but  will  open 
the  door  to  the  assessment  of  multiple  virulence  factors.  For  instance,  it  has  already  been 
demonstrated  that  CagA  is  a  “master”  virulence  factor  (13,  37,  38),  and  that  there  is  an 
association between other virulence factors and disease among the different cagA alleles.
This is especially true for different vacA alleles (38, 42). Thus, once the EPIYA isogenic
strains  are  created  and  characterized,  they  could  then  be  used  as  the  parental  strain 
background  to  create  isogenic  strains  containing  different  polymorphic  forms  of  other 
virulence  factors.  This  would  allow  us  to  not  only  assess  the  role  of  different  virulence 
factors  in  disease,  but  also  their  role  in  disease  development  within  the  context  of  a 
particular  cagA  allele. 

If  I  were  to  undertake  these  assays  personally,  I  would  focus  first  on  VacA.  As 
discussed  earlier,  VacA  is  polymorphic  and  different  vacA  alleles  impact  disease 
differently  (15,  37,  48,  58).  Thus,  we  could  create  vacA  isogenic  strains  within  certain 
cagA  isogenic  backgrounds  to  more  conclusively  look  at  CagA  and  VacA  interaction. 
Specifically, I would alter the vacA alleles within the context of the EPIYA-AB'C,
EPIYA-AB'CCC, EPIYA-AB'D and the restorant strains. Experiments with these strains
would identify any differences between East Asian (EPIYA-AB'D) and Western
(EPIYA-AB'C and EPIYA-AB'CCC) strains. They would also indicate if the number of
EPIYA-C motifs influences the effects of the different vacA alleles. The restorant strain
would provide an important control for genetic manipulation of our strains. These types
of studies would allow our lab to not only examine the role of the different vacA alleles,
but to assess the impact of these alleles within the context of the cagA allele.
Furthermore, these types of studies may also clarify some of the discrepancies in the
literature as to the impact of different vacA regions (37, 75).

Additionally,  I  would  likely  focus  on  two  other  polymorphic  virulence  factors  that 
have  been  implicated  in  disease  development,  HomB  (19,  42,  64)  and  OipA  (16,  61). 

The Helicobacter outer membrane (Hom) proteins are complex because H. pylori has two
loci that can encode a Hom protein; strains can carry homA alone, homB alone,
homA/homB, homA/homA, homB/homB, or be negative for both homA and homB (19, 42,
64). The presence of homB has been linked to the development of more severe disease,
as compared to the presence of homA (19, 42, 64). Additionally, a dose response has
been identified for strains carrying homB/homB (62, 63). OipA is an outer
membrane protein whose expression has been shown to be subject to phase variation due
to the number of CT repeats found in the oipA signal sequence (6). OipA “on” is used to
designate a strain that expresses a functional protein (16, 61). Again, strains that encode
a functional OipA protein are associated with more severe disease outcomes (16, 61).
Moreover, the OipA “on” phenotype is often found in cagA positive strains (6). The
impact  of  homB  and  oipA  could  be  assessed  individually  or  in  the  context  of  different 
virulence factors. In other words, besides creating isogenic derivatives of these factors
within the EPIYA-AB'C, EPIYA-AB'CCC, EPIYA-AB'D and restorant strains to identify
differences among the different cagA alleles, they could also be assessed within the
EPIYA isogenic strains that carried different vacA alleles. For instance, I would likely
first assess the EPIYA-AB'C, EPIYA-AB'CCC, EPIYA-AB'D and the restorant strains
that were s1/i1/m1 and s2/i2/m2. This would help limit the number of strains, especially
since there are six different hom combinations that could be assessed. This would also
seem to be the most likely place to observe differences, since the s1/i1/m1 is the most
virulent and the s2/i2/m2 is the least virulent vacA allele. If a difference between these
populations  was  observed,  isogenic  strains  within  the  different  vacA  alleles  could  then  be 
created  and  assessed.  This  system  would  also  allow  us  to  systematically  examine  these 
virulence  factors  in  the  context  of  the  whole  bacterium,  and  to  assess  their  role  in  disease 
progression, including progression to gastric cancer. This would also be a better system
to analyze the role of these different virulence factors in the context of geographical
differences, which could help expand the field of H. pylori pathogenesis.
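
As a rough illustration of the strain count implied by this design, the following Python sketch enumerates one isogenic strain per combination of cagA background, vacA allele and hom genotype. The strain labels are taken from the text above; the variable and function names are hypothetical, and the assumption of exactly one strain per combination is made here only for counting purposes.

```python
from itertools import product

# Labels come from the text; everything else is an illustrative assumption.
CAGA_BACKGROUNDS = ["EPIYA-AB'C", "EPIYA-AB'CCC", "EPIYA-AB'D", "restorant"]
VACA_ALLELES = ["s1/i1/m1", "s2/i2/m2"]  # most and least virulent alleles
HOM_GENOTYPES = [
    "homA", "homB", "homA/homB", "homA/homA", "homB/homB", "hom-negative",
]


def enumerate_strains(backgrounds, vaca_alleles, hom_genotypes):
    """Return every (cagA background, vacA allele, hom genotype) combination."""
    return list(product(backgrounds, vaca_alleles, hom_genotypes))


panel = enumerate_strains(CAGA_BACKGROUNDS, VACA_ALLELES, HOM_GENOTYPES)
print(len(panel))  # 4 backgrounds x 2 vacA alleles x 6 hom genotypes = 48
```

Even when vacA is restricted to two alleles, the hom panel alone would require 48 strains under these assumptions, which is why limiting the initial vacA backgrounds matters.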

Conclusions 

H.  pylori  is  a  medically  important  pathogen  that  has  successfully  challenged  the 
preconceived idea that bacteria cannot cause gastric disease. However, many
questions remain about this process, as well as about which host, environmental, and bacterial
factors are important for the progression to severe disease. This thesis focused on the
bacterial  toxins,  CagA  and  VacA,  and  their  role  in  influencing  progression  to  more  severe 
disease. To that end, we were the first to statistically link a specific cagA allele
(EPIYA-ABD) to gastric cancer development. We were also able to analyze this large population
of clinical isolates for the role of VacA polymorphisms, and found that while the
distribution of vacA alleles was not directly associated with disease state, it was
associated with the distribution of cagA alleles and was part of a three-way association
among the vacA allele, the cagA allele and disease state. During the course of this
work, we identified an amino acid (196) important for the development of gastric cancer
within the intermediate region of VacA. Additionally, we identified some associations
that were CagA-dependent, such as the association of VacA and disease state in the
EPIYA-ABD population and the association of amino acid distribution at position 231 and
disease state in the non-EPIYA-ABD population. Finally, we were able to optimize techniques
that will be used in future studies aimed at characterizing CagA isogenic strains.

While little is currently understood about why some H. pylori infected individuals
develop gastric cancer while others remain asymptomatic, the data gathered during the
course of this thesis will help shed some light on the pathogenesis of H. pylori-induced
disease. Indeed, the elucidation of bacterial factors that are involved in the pathogenesis
of H. pylori-induced disease, such as EPIYA-ABD CagA, is important, because such factors can
serve as diagnostic markers of infection with a more virulent strain. Understanding any
hierarchy of virulence factors is imperative, because evidence has accumulated
that colonization with H. pylori is protective against other illnesses,
including asthma (73), tuberculosis (70) and esophageal cancer (reviewed in (17)). This
suggests that treatment should only be used for individuals infected with highly
virulent strains. Treatment that targets only patients infected with strains that could
cause severe disease would be akin to geographically personalized treatment, which I
believe is the future for treating H. pylori-induced disease. Implementing such
location-specific treatments could aid in eliminating a majority of gastric cancer mortality and
morbidity worldwide without sacrificing the protective effects provided by infection with
H. pylori.